In Amazon S3, a bucket is a fundamental container for storing objects. Each bucket has a globally unique name across all AWS accounts and regions. Buckets serve as:
- The top-level namespace for object storage
- The boundary for access control policies
- The scope for lifecycle rules and versioning configurations
aws s3api create-bucket \
    --bucket my-unique-bucket-name \
    --region us-west-2 \
    --create-bucket-configuration LocationConstraint=us-west-2
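The same request can be made from Python. The sketch below assumes boto3's `create_bucket` call; the helper names (`create_bucket_kwargs`, `create_bucket`) are hypothetical. It builds the arguments separately so the region handling is visible: `us-east-1` is the one region where the `CreateBucketConfiguration` block has to be left out.

```python
def create_bucket_kwargs(bucket_name, region):
    """Build keyword arguments for an S3 client's create_bucket() call.

    us-east-1 is the default location and rejects an explicit
    LocationConstraint, so the configuration block is omitted there.
    """
    kwargs = {"Bucket": bucket_name}
    if region != "us-east-1":
        kwargs["CreateBucketConfiguration"] = {"LocationConstraint": region}
    return kwargs


def create_bucket(client, bucket_name, region):
    """Create a bucket using any client that exposes create_bucket(**kwargs),
    for example boto3.client("s3", region_name=region)."""
    return client.create_bucket(**create_bucket_kwargs(bucket_name, region))


print(create_bucket_kwargs("my-unique-bucket-name", "us-west-2"))
```

Passing the client in as a parameter keeps the helper easy to exercise with a stub before pointing it at a real account.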
Unlike traditional file systems, S3 has no real folders. What S3 clients display as folders is:
- Object keys with prefixes and delimiters (usually '/')
- A presentation layer convention adopted by most S3 clients
- Not a physical directory structure
// These appear as "folders" but are just key prefixes
s3://my-bucket/photos/2023/
s3://my-bucket/photos/2023/january/photo1.jpg
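To make the convention concrete, here is a minimal pure-Python sketch of how a client could derive a "folder" view from a flat list of keys. The function name `list_level` and the sample keys are assumptions for illustration; the split mirrors the `Contents` / `CommonPrefixes` distinction that S3 listings use, but it is not S3's actual implementation.

```python
def list_level(keys, prefix="", delimiter="/"):
    """Split a flat list of keys into direct objects and "subfolders"
    under `prefix`, similar to the Contents / CommonPrefixes split."""
    contents, common_prefixes = [], []
    for key in sorted(keys):
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        cut = rest.find(delimiter)
        if cut == -1:
            # No further delimiter: the object sits directly at this level.
            contents.append(key)
        else:
            # Everything up to the next delimiter is rolled up into one entry.
            folder = prefix + rest[:cut + len(delimiter)]
            if folder not in common_prefixes:
                common_prefixes.append(folder)
    return contents, common_prefixes


keys = [
    "photos/2023/january/photo1.jpg",
    "photos/2023/february/photo2.jpg",
    "photos/2023/index.html",
    "documents/reports/q1.pdf",
]
contents, folders = list_level(keys, prefix="photos/2023/")
print(contents)  # ['photos/2023/index.html']
print(folders)   # ['photos/2023/february/', 'photos/2023/january/']
```

No folder object is ever consulted: the "subfolders" are computed purely from the key strings.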
Feature | Bucket | Folder (Prefix) |
---|---|---|
Uniqueness | Globally unique name | Scoped to its bucket; shared by many keys |
Creation | Explicit API call | Implied by object key |
Permissions | Bucket policies apply | Inherits bucket permissions unless a policy targets the prefix |
Cost | No direct charge | No additional charge |
Limits | Per-account quota (100 was the long-standing default; adjustable through Service Quotas) | Unlimited |
Here's how to list objects with a specific prefix (what appears as a folder):
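import boto3

s3 = boto3.client('s3')

# Delimiter='/' rolls deeper keys up into CommonPrefixes instead of listing them
response = s3.list_objects_v2(
    Bucket='my-bucket',
    Prefix='photos/2023/',
    Delimiter='/'
)

# Objects directly under the prefix
for content in response.get('Contents', []):
    print(content['Key'])

# Common prefixes are returned in 'CommonPrefixes',
# which represents what clients show as "subfolders"
for common_prefix in response.get('CommonPrefixes', []):
    print(common_prefix['Prefix'])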
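A single `list_objects_v2` call returns at most 1,000 keys, so a large "folder" needs pagination. The following is a sketch using boto3's paginator interface; `iter_folder` is a hypothetical helper name, and the client is passed in so the function can also be driven by a stub.

```python
def iter_folder(client, bucket, prefix, delimiter="/"):
    """Yield ("object", key) and ("prefix", common_prefix) pairs for one
    "folder" level, following pagination across all result pages."""
    paginator = client.get_paginator("list_objects_v2")
    pages = paginator.paginate(Bucket=bucket, Prefix=prefix, Delimiter=delimiter)
    for page in pages:
        # Objects that sit directly under the prefix
        for obj in page.get("Contents", []):
            yield "object", obj["Key"]
        # Rolled-up "subfolders"
        for sub in page.get("CommonPrefixes", []):
            yield "prefix", sub["Prefix"]


# Usage, assuming boto3 is installed and credentials are configured:
#   import boto3
#   for kind, name in iter_folder(boto3.client("s3"), "my-bucket", "photos/2023/"):
#       print(kind, name)
```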
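1. Performance considerations: S3 request rates scale per prefix (at least 3,500 write and 5,500 read requests per second per prefix), so spreading hot keys across several prefixes matters more than how deeply they are nested.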
2. Listing operations: Listing objects with prefixes is more efficient than scanning entire buckets.
3. Lifecycle rules: Can be applied to prefixes but not to "folders" as entities.
# Example lifecycle rule for a prefix
lifecycle_config = {
    'Rules': [
        {
            'ID': 'ArchivePhotos',
            'Filter': {
                'Prefix': 'photos/'
            },
            'Status': 'Enabled',
            'Transitions': [
                {
                    'Days': 30,
                    'StorageClass': 'STANDARD_IA'
                }
            ]
        }
    ]
}
s3.put_bucket_lifecycle_configuration(
    Bucket='my-bucket',
    LifecycleConfiguration=lifecycle_config
)
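Because `put_bucket_lifecycle_configuration` replaces the bucket's entire lifecycle configuration, rules for several prefixes are usually assembled together and sent in one call. A small sketch of that, where the helper names and the `logs/` rule are hypothetical:

```python
def prefix_transition_rule(rule_id, prefix, days, storage_class):
    """Return one lifecycle rule that transitions objects under `prefix`."""
    return {
        "ID": rule_id,
        "Filter": {"Prefix": prefix},
        "Status": "Enabled",
        "Transitions": [{"Days": days, "StorageClass": storage_class}],
    }


def build_lifecycle_config(rules):
    """Wrap rules in the structure passed as LifecycleConfiguration."""
    rules = list(rules)
    ids = [rule["ID"] for rule in rules]
    if len(ids) != len(set(ids)):
        raise ValueError("lifecycle rule IDs must be unique within a bucket")
    return {"Rules": rules}


config = build_lifecycle_config([
    prefix_transition_rule("ArchivePhotos", "photos/", 30, "STANDARD_IA"),
    prefix_transition_rule("ArchiveLogs", "logs/", 90, "GLACIER"),  # hypothetical rule
])
print(len(config["Rules"]))  # 2

# With a boto3 client `s3`, the whole configuration is then applied at once:
#   s3.put_bucket_lifecycle_configuration(
#       Bucket="my-bucket", LifecycleConfiguration=config)
```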
Despite not being real folders, the prefix convention is useful for:
- Organizing related objects (logs by date, images by category)
- Applying common permissions via IAM policies on prefixes
- Implementing logical partitioning for analytics tools like Athena
# Example IAM policy granting access to a specific prefix
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject"],
      "Resource": ["arn:aws:s3:::my-bucket/department/engineering/*"]
    }
  ]
}
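A policy like this can be generated per prefix. The sketch below is one possible way to do it; `prefix_read_policy` is a hypothetical helper. It also adds a second statement that the policy above does not have: `s3:ListBucket` is a bucket-level action, so letting a user browse the prefix means targeting the bucket ARN and narrowing it with the `s3:prefix` condition key.

```python
import json


def prefix_read_policy(bucket, prefix):
    """Build an IAM policy document allowing read and list access
    limited to keys under `prefix` (expected to end with '/')."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": [f"arn:aws:s3:::{bucket}/{prefix}*"],
            },
            {
                # Listing is granted on the bucket itself and restricted
                # to the prefix through a condition.
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": [f"arn:aws:s3:::{bucket}"],
                "Condition": {"StringLike": {"s3:prefix": [f"{prefix}*"]}},
            },
        ],
    }


policy = prefix_read_policy("my-bucket", "department/engineering/")
print(json.dumps(policy, indent=2))
```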
In Amazon S3, buckets are fundamental containers for storing objects, while folders are essentially object key prefixes that simulate a hierarchical structure. Unlike traditional file systems, S3 is a flat object storage system where folders don't physically exist as separate entities.
Buckets serve as the top-level namespace for your S3 resources with these characteristics:
- Require globally unique names across all AWS accounts
- Have region-specific locations
- Support versioning, logging, and lifecycle policies at bucket level
- Enforce security policies and access controls
What appears as folders in S3 management consoles is actually just key naming conventions. For example:
// These represent "folders" in the UI
photos/vacation/beach.jpg
documents/reports/q1.pdf
The flat structure affects operations in several ways:
# List objects with a specific prefix (simulating folder contents)
aws s3 ls s3://my-bucket/documents/ --recursive
# Delete "folder" contents (this deletes every object with the prefix)
aws s3 rm s3://my-bucket/documents/ --recursive
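Programmatically, deleting a "folder" comes down to listing keys by prefix and deleting them in batches. The following is a rough sketch assuming boto3's `list_objects_v2` paginator and `delete_objects` (which accepts up to 1,000 keys per request); `delete_prefix` is a hypothetical name. On a versioned bucket this adds delete markers rather than removing the stored versions.

```python
def delete_prefix(client, bucket, prefix, batch_size=1000):
    """Delete every object whose key starts with `prefix`.

    Returns the number of keys submitted for deletion.
    """
    paginator = client.get_paginator("list_objects_v2")
    submitted = 0
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        keys = [{"Key": obj["Key"]} for obj in page.get("Contents", [])]
        # delete_objects takes a limited number of keys per call, so chunk them.
        for start in range(0, len(keys), batch_size):
            batch = keys[start:start + batch_size]
            client.delete_objects(
                Bucket=bucket,
                Delete={"Objects": batch, "Quiet": True},
            )
            submitted += len(batch)
    return submitted


# Usage, assuming boto3 is installed and credentials are configured:
#   import boto3
#   delete_prefix(boto3.client("s3"), "my-bucket", "documents/")
```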
Buckets are ideal when you need:
- Separate security domains
- Different lifecycle rules
- Distinct logging requirements
Prefixes (folders) work better for:
- Organizing related objects
- Applying batch operations
- Maintaining logical separation
For large-scale systems, consider these approaches:
// Partitioning strategy for analytics
s3://data-lake/raw/year=2023/month=03/day=15/logfile.json
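// Using S3 Inventory to track "folder" contents
// (Id, IsEnabled, IncludedObjectVersions, Schedule, and Format are required by the API;
// the Id and Filter values shown here are illustrative)
{
  "InventoryConfiguration": {
    "Id": "documents-inventory",
    "IsEnabled": true,
    "IncludedObjectVersions": "Current",
    "Schedule": { "Frequency": "Daily" },
    "Filter": { "Prefix": "documents/" },
    "Destination": {
      "S3BucketDestination": {
        "Bucket": "arn:aws:s3:::inventory-bucket",
        "Format": "CSV",
        "Prefix": "inventory"
      }
    }
  }
}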
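For the partitioning strategy shown above, keys are typically generated rather than typed by hand. A minimal sketch, assuming Hive-style `year=/month=/day=` segments as in the example key; both function names are hypothetical:

```python
from datetime import date


def partition_key(base_prefix, day, filename):
    """Build a Hive-style partitioned key such as
    raw/year=2023/month=03/day=15/logfile.json."""
    return (
        f"{base_prefix.rstrip('/')}/"
        f"year={day.year:04d}/month={day.month:02d}/day={day.day:02d}/"
        f"{filename}"
    )


def partition_prefix(base_prefix, year, month=None):
    """Prefix selecting a whole year, or a single month, of partitions."""
    prefix = f"{base_prefix.rstrip('/')}/year={year:04d}/"
    if month is not None:
        prefix += f"month={month:02d}/"
    return prefix


print(partition_key("raw", date(2023, 3, 15), "logfile.json"))
# raw/year=2023/month=03/day=15/logfile.json
print(partition_prefix("raw", 2023, 3))
# raw/year=2023/month=03/
```

Zero-padding the month and day keeps keys in chronological order when they are listed lexicographically.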
S3 storage capacity is effectively unlimited, but prefix design affects:
- List operations performance
- Request rate scalability
- Parallel processing efficiency
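Nesting depth does not slow S3 down by itself, but very deep hierarchies (more than 5-7 levels) produce long keys, take more requests to walk level by level, and are harder to manage. Keep nesting shallow where practical.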
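One way prefix design pays off is that independent prefixes can be processed in parallel. The sketch below fans a listing function out over several prefixes with a thread pool. `count_keys_by_prefix` is a hypothetical helper, and `fake_list` is an in-memory stand-in for a real listing call (for example a wrapper around a `list_objects_v2` paginator).

```python
from concurrent.futures import ThreadPoolExecutor


def count_keys_by_prefix(list_keys, prefixes, max_workers=8):
    """Call `list_keys(prefix)` for each prefix in parallel and return
    {prefix: number_of_keys}. `list_keys` is any callable that returns
    an iterable of keys for the given prefix."""
    prefixes = list(prefixes)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        counts = pool.map(lambda p: sum(1 for _ in list_keys(p)), prefixes)
        return dict(zip(prefixes, counts))


# In-memory stand-in for a bucket listing, used only for this demonstration.
all_keys = ["photos/a.jpg", "photos/b.jpg", "documents/q1.pdf"]


def fake_list(prefix):
    return [key for key in all_keys if key.startswith(prefix)]


counts = count_keys_by_prefix(fake_list, ["photos/", "documents/", "logs/"])
print(counts)  # {'photos/': 2, 'documents/': 1, 'logs/': 0}
```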