Editor’s note: This post was originally published in January 2020 and was last updated for accuracy and comprehensiveness in February 2021.
AWS S3 is an extraordinarily versatile data store that promises great scalability, reliability, and performance. Yet S3 bucket security continues to make the news for all the wrong reasons: from the exposure of 200 million US voters’ preferences in 2017, to the massive leaks of social media account data in 2018, to the infamous ‘Leaky Buckets’ episode in 2019 that shook some of the largest organizations, including Capital One, Verizon, and even defense contractors. Such data leaks are, almost without exception, the result of unsecured S3 buckets.
This article is the second installment in our series of AWS security logging tutorials, this time focused on monitoring S3 buckets with a special emphasis on object-level security (read the first one here). You will discover how an in-depth, monitoring-based approach can go a long way toward enhancing your organization’s data access and security efforts. With practical instructions, we will walk through everything you need to know to configure S3 bucket access logging, along with CloudFormation samples to kick-start the process.
S3 bucket access logging captures information on all requests made to a bucket, such as PUT, GET, and DELETE actions. Bucket access logging is a recommended security best practice that can help teams uphold compliance standards and identify unauthorized access to their data. In particular, S3 access logs will be one of the first sources required in any data breach investigation, as they track data access patterns over your buckets.
Before we begin, let’s make sure to have the following prerequisites in place:
- An AWS account with permissions to manage S3 (and CloudTrail for the later section)
- The AWS CLI installed and configured
- A local clone of the tutorials repository used throughout this series
Next, let’s review some terminology:
- Source bucket: the bucket whose object-level requests you want to log
- Target bucket: the bucket where the access logs are delivered
S3 bucket access logging is configured on the source bucket by specifying a target bucket and prefix where access logs will be delivered. It’s important to note that target buckets must live in the same region and account as the source buckets.
To create a target bucket from our predefined CloudFormation templates, run the following command from the cloned tutorials folder:
$ make deploy \
tutorial=aws-security-logging \
stack=s3-access-logs-bucket \
region=us-east-1
This will create a new target bucket with the LogDeliveryWrite ACL to allow logs to be written from various source buckets.
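For reference, here is a minimal sketch of what such a template might look like; the resource name is illustrative, and only the properties relevant to this tutorial are shown:

AWSTemplateFormatVersion: '2010-09-09'
Description: Target bucket for collecting S3 server access logs (illustrative sketch)
Resources:
  S3AccessLogsBucket:
    Type: AWS::S3::Bucket
    Properties:
      # Name follows the <AccountId>-s3-access-logs-<Region> pattern used below
      BucketName: !Sub '${AWS::AccountId}-s3-access-logs-${AWS::Region}'
      # Grants the S3 log delivery group permission to write access logs
      AccessControl: LogDeliveryWrite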
Next, let’s configure a source bucket to monitor by filling out the information in the aws-security-logging/access-logging-config.json file:
{
  "LoggingEnabled": {
    "TargetBucket": "<AccountId>-s3-access-logs-<Region>",
    "TargetPrefix": "<Source-Bucket-Name>/"
  }
}
Then, run the following AWS CLI command to enable monitoring:
$ aws s3api put-bucket-logging \
--bucket <Source-Bucket-Name> \
--bucket-logging-status file://aws-security-logging/access-logging-config.json
To validate that the logging pipeline is working, list objects in the target bucket with the AWS Console.
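The same check works from the command line; a quick sketch, assuming the bucket naming pattern from the template above:

$ aws s3 ls s3://<AccountId>-s3-access-logs-<Region>/<Source-Bucket-Name>/

Keep in mind that the first log files can take an hour or more to show up, since access log delivery is periodic rather than real time.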
The server access logging configuration can also be verified in the source bucket’s properties in the AWS Console.
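Alternatively, the following AWS CLI call returns the LoggingEnabled configuration when logging is set up (and empty output when it is not):

$ aws s3api get-bucket-logging --bucket <Source-Bucket-Name>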
Next, we will examine the collected log data.
S3 access log files are written to the bucket with the following format:
TargetPrefixYYYY-mm-DD-HH-MM-SS-UniqueString
Where:
- TargetPrefix is what we specified in the access-logging-config.json file
- YYYY-mm-DD-HH-MM-SS is the date/time in UTC when the log file was delivered
- UniqueString carries no meaning and is simply there to prevent files from being overwritten

It’s also important to understand that log files are written on a best-effort basis, meaning on rare occasions the data may never be delivered.
S3 access logs are written with the following space-delimited format:
79a59df900b949e55d96a1e698fbacedfd6e09d98eacf8f8d5218e7cd47ef2be
test-bucket [31/Dec/2019:02:05:35 +0000] 63.115.34.165 - E63F54061B4D37D3 REST.PUT.OBJECT test-file.png
"PUT /test-file.png?X-Amz-Security-Token=token-here&X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Date=20191231T020534Z&X-Amz-SignedHeaders=content-md5%3Bcontent-type%3Bhost%3Bx-amz-acl%3Bx-amz-storage-class&X-Amz-Expires=300&X-Amz-Credential=ASIASWJRT64ZSKVRP62Z%2F20191231%2Fus-west-2%2Fs3%2Faws4_request&X-Amz-Signature=XXX
HTTP/1.1" 200 - - - 1 - "https://s3.console.aws.amazon.com/s3/buckets/test-bucket/?region=us-west-2&tab=overview"
"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.88 Safari/537.36" - Ox6nZZWoBZYJ/a/HLXYw2PVp1nXdSmqdp4fV37m/8SC54q7zTdlAYxuFOWYgOeixYT+yPs6prdc= - ECDHE-RSA-AES128-GCM-SHA256 -
test-bucket.s3.us-west-2.amazonaws.com TLSv1.2
The following information can be extracted from this log to understand the nature of the request:
- The object test-file.png was PUT into test-bucket
- The request occurred on 31/Dec/2019:02:05:35 +0000
- The request originated from the IP address 63.115.34.165
- The client was a Mac OS X 10.15.2 laptop running Chrome 79

The additional context we can gather from the log includes:
- 79a59df900b949e55d96a1e698fbacedfd6e09d98eacf8f8d5218e7cd47ef2be is the bucket owner’s canonical user ID (an identifier for your account)
- The bucket lives in us-west-2, per the bucket FQDN test-bucket.s3.us-west-2.amazonaws.com
- E63F54061B4D37D3 is the request ID, which can be useful when troubleshooting with AWS Support
For a full reference of each field, check out the AWS documentation.
To gain a deeper understanding of S3 access patterns, we can use AWS Athena, which is a service to query data on S3 with SQL. The following tutorial from AWS can be used to quickly set up an Athena table to enable queries on our newly collected S3 access logs. Remember to point the table to the S3 bucket named <AccountId>-s3-access-logs-<Region>.
Once configured, queries such as the following can be run:
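Here is a sketch, assuming the table name s3_access_logs_db.mybucket_logs and the column names from the linked AWS tutorial, that surfaces the most recent write operations:

-- Show the ten most recent PUT requests, who made them, and from where
SELECT requestdatetime, remoteip, requester, key
FROM s3_access_logs_db.mybucket_logs
WHERE operation = 'REST.PUT.OBJECT'
ORDER BY parse_datetime(requestdatetime, 'dd/MMM/yyyy:HH:mm:ss Z') DESC
LIMIT 10;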
Other types of helpful queries include finding requests that resulted in errors, spotting unusually active client IPs, or measuring how much data left a bucket over a given period. For example:
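A sketch along those lines, again assuming the tutorial’s table and column names, counts client-side errors by requester IP:

-- Count 4xx responses per client IP to spot misbehaving or probing clients
SELECT remoteip, count(*) AS error_count
FROM s3_access_logs_db.mybucket_logs
WHERE httpstatus LIKE '4%'
GROUP BY remoteip
ORDER BY error_count DESC;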
Next, we’ll look into an alternative method for understanding S3 access patterns with CloudTrail.
AWS CloudTrail is a service to audit all activity within your AWS account. It can also monitor events such as GetObject, PutObject, or DeleteObject on S3 bucket objects by enabling data event capture.
If you followed our previous tutorial on CloudTrail, then you are ready to go! If not, walk through it to set one up.
To enable data events from the CloudTrail Console, open the trail to edit, and then, under the data events settings, select S3, choose the bucket(s) to monitor (or all current and future S3 buckets), and select whether to capture read events, write events, or both.
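The same configuration can be applied with the AWS CLI; a minimal sketch, where the trail and bucket names are placeholders:

$ aws cloudtrail put-event-selectors \
--trail-name <Trail-Name> \
--event-selectors '[{"ReadWriteType": "All", "IncludeManagementEvents": true, "DataResources": [{"Type": "AWS::S3::Object", "Values": ["arn:aws:s3:::<Source-Bucket-Name>/"]}]}]'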
Now, when data is accessed in your bucket by authenticated users, CloudTrail will capture this context. To see the results use AWS Athena with the following sample query:
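A sketch, assuming a CloudTrail table named cloudtrail_logs was created as in the previous tutorial (the table name is illustrative):

-- Show recent object-level S3 activity captured by CloudTrail data events
SELECT eventtime, eventname, useridentity.arn, requestparameters
FROM cloudtrail_logs
WHERE eventsource = 's3.amazonaws.com'
  AND eventname IN ('GetObject', 'PutObject', 'DeleteObject')
ORDER BY eventtime DESC
LIMIT 10;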
Additional SQL queries can be run to understand patterns and statistics.
Logging is an intrinsic part of any security operation, including auditing and monitoring. That’s no different when working on AWS, which offers two ways to log access to S3 buckets: S3 access logging and CloudTrail object-level (data event) logging. In this section, we will help you understand the differences between the two, explore their functionality, and make informed decisions when choosing one over the other.
Our recommendation is the following: for security monitoring, CloudTrail data events are generally the stronger choice, since they provide rich identity context (such as the IAM ARN of the caller) and more reliable, timely delivery alongside the rest of your audit trail. S3 access logs remain useful when you need request-level details that CloudTrail does not capture, such as object size, total request time, turnaround time, or the HTTP referer, or when you want to avoid per-event data-event charges. Many teams enable both.
As attackers continue to discover vulnerabilities in Amazon S3 configurations, native cloud services on their own don’t offer the functionality needed to detect breaches and harden cloud infrastructure. Monitoring sensitive data in S3 requires end-to-end traffic visibility. By having complete visibility on how your data is accessed, you can create a robust strategy to monitor and secure S3 buckets.
Do you have a continuous monitoring strategy to detect suspicious activity in your S3 buckets? Watch our On Demand Webinar: Detecting S3 Breaches with Panther to find out how to detect unauthorized access to your buckets and learn in-depth techniques to monitor them.
To get started, contact us for a demo.