Why You Should Be Ingesting AWS CloudTrail Logs

TL;DR For cost-effective CloudTrail log ingestion, invest in filtering, enrichment, and tuning, alongside adopting a cloud-native SIEM with zero infrastructure management.

Cybersecurity practitioners label log sources as “noisy” when they produce large quantities of logs at high speed. The label is negative: it flags log sources that you need to manage in order to prevent serious problems like excessive ingestion costs and a threat detection environment that’s clogged with irrelevant information.

This blog will guide you on tackling these problems with AWS CloudTrail logs. You’ll understand how SIEM infrastructure and licensing limitations can stand in the way of cost-effective ingestion. You’ll also learn practical methods to control costs and improve threat detection: filtering out irrelevant log data and boosting your signal-to-noise ratio with log enrichment and detection-as-code.

Why you should be ingesting AWS CloudTrail logs

AWS CloudTrail logs enable you to identify actions taken within your AWS infrastructure, including who or what performed the action, on which resource, and when. CloudTrail logs provide this information by recording events that occur in the AWS Management Console, AWS Command Line Interface, and AWS SDKs and APIs. This broad logging scope makes these logs a goldmine of critical security data. Let’s dig into a few examples of the security information that CloudTrail logs report:

  • Visibility of CodeBuild projects. When an AWS CodeBuild project is made public, this could indicate an exfiltration attack where adversaries may be stealing data from your network. 
  • Identity and Access Management (IAM) events. When IAM reports an event about a user assuming a role that was explicitly blocklisted for manual user assumption, this could indicate a privilege escalation attack.
  • Lambda or ECR events. An unauthorized create, read, update, or delete (CRUD) event to an AWS Lambda or ECR service could indicate an attack to implant cloud or container images with malicious code and establish persistence after gaining access to an environment.
  • Web Application Firewall (WAF) events. If an AWS WAF dissociates a web access control list (WACL) from a source, this could indicate a denial of service (DoS) attack over a network.

This is just a short list of examples that highlight the critical security information reported in CloudTrail logs. In order for you to have visibility into your AWS infrastructure, ingesting CloudTrail logs into your Security Information and Event Management (SIEM) platform is not optional, but a basic necessity. 
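To make this concrete, each CloudTrail record is a JSON object whose eventSource and eventName fields identify the action taken. The sketch below maps a few of the event types above to triage labels; the record contents and the label mapping are illustrative only, and a real detection would apply more logic (for example, checking an assumed role against a blocklist):

```python
import json

# A trimmed, invented CloudTrail-style record; real records carry many more fields.
record = json.loads("""
{
  "eventSource": "sts.amazonaws.com",
  "eventName": "AssumeRole",
  "userIdentity": {"arn": "arn:aws:iam::123456789012:user/alice"},
  "eventTime": "2024-01-15T12:00:00Z"
}
""")

# Illustrative mapping of the event types above to triage labels.
# Note that AssumeRole events are reported by sts.amazonaws.com.
SUSPICIOUS_EVENTS = {
    ("codebuild.amazonaws.com", "UpdateProjectVisibility"): "possible exfiltration",
    ("sts.amazonaws.com", "AssumeRole"): "possible privilege escalation",
    ("wafv2.amazonaws.com", "DisassociateWebACL"): "possible DoS",
}

def triage(rec):
    """Return a triage label for a record, or None if it is not of interest."""
    return SUSPICIOUS_EVENTS.get((rec.get("eventSource"), rec.get("eventName")))

print(triage(record))  # possible privilege escalation
```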

Challenges with noisy logs

It is well known that CloudTrail is a voluminous log source, producing large quantities of logs at high speed.

Just like popups, ads, and notifications degrade your concentration and productivity, noisy logs overwhelm practitioners and congest the threat detection environment. This leads to well-known issues that increase risk: alert fatigue, false positives, and slow mean time to resolve (MTTR). The March 2023 attack on 3CX’s supply chain is a solid example of how dangerous alert fatigue can be.

Equally problematic is the issue of cost. The sheer volume of AWS CloudTrail logs makes ingestion very expensive, if not cost-prohibitive, which challenges the basic requirement for visibility in threat detection. While more logs always require greater storage capacity, the real culprits driving up spending are SIEM infrastructure and licensing structures that aren’t friendly to high-volume cloud logs.

In order to maintain a baseline of performance, traditional SIEMs require ongoing database management to adjust how logs are indexed based on the log write frequency and infrastructure hardware profile. Despite this management overhead, only a small subset of your data remains in hot storage for rapid search. Most data resides in warm or cold storage that uses fewer resources, but takes longer to query during incident response and threat hunting. When faced with these challenges on top of additional licensing fees for cloud logs, security teams often don’t ingest CloudTrail logs into their SIEM, resorting instead to siloing their log data in an S3 bucket, if at all.

Migrate to a cloud-based SIEM to control costs

If you are still using a traditional SIEM, migrating to a cloud-based SIEM will have the most significant impact on your ability to cost-effectively ingest AWS logs and gain full system visibility. 

Here’s why: a cloud-based SIEM leverages serverless architecture to eliminate infrastructure management. Further, log data is saved in a security data lake that uses modern cloud database technology designed not only to handle cloud-scale data but also to guarantee query performance. This enables cloud-based SIEM platforms to deliver exactly what modern security teams need:

  • Petabyte-scale search. By separating how data is stored and computed, the security data lake provides hot storage going as far back as 365 days for rapid investigation and response, even when running concurrent workloads.
  • Unified data model (UDM). A UDM standardizes a set of unified fields across all log types, making manual field mapping optional rather than required.
  • Instant scalability. With a security data lake, you can scale resources up or down as needed for incident response and threat hunting.
  • Fully managed data. The security data lake includes data encryption, compression, governed policies and rules, data-masking, automated JSON parsing, and more, reducing data management workloads. 

Filter raw log data to reduce noise and control costs

Short of adopting a cloud-native SIEM, the best way to manage AWS CloudTrail noise and cost is to filter out unwanted data. There are a few ways to do this depending on how your log pipeline is set up.

Within AWS, you can configure CloudTrail using event selectors and advanced event selectors to identify which events you want a trail to log. The AWS docs explain, “for each trail, if the event matches any event selector, the trail processes and logs the event.” Any other events are filtered out. 
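As a sketch, an advanced event selector is a list of field conditions. The field names below (eventCategory, resources.type, readOnly) come from the AWS advanced event selector schema; the trail name and the choice to keep only S3 write data events are assumptions for illustration, and the boto3 call is shown commented out rather than executed:

```python
# Sketch: an advanced event selector that keeps only S3 write data events,
# dropping read-only calls such as HeadBucket and GetObject.
#
# In practice you would apply it with:
#   import boto3
#   boto3.client("cloudtrail").put_event_selectors(
#       TrailName="my-trail",  # hypothetical trail name
#       AdvancedEventSelectors=selectors,
#   )
selectors = [
    {
        "Name": "S3 write data events only",
        "FieldSelectors": [
            {"Field": "eventCategory", "Equals": ["Data"]},
            {"Field": "resources.type", "Equals": ["AWS::S3::Object"]},
            {"Field": "readOnly", "Equals": ["false"]},
        ],
    }
]
```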

However, if your log pipeline uses AWS CloudWatch to aggregate and forward logs, you may prefer to use subscription filters to control which logs are sent to your SIEM. 
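A minimal sketch of that approach, assuming CloudWatch Logs’ JSON filter pattern syntax and a Firehose destination; the log group name, filter name, and ARNs are all hypothetical:

```python
# Sketch: parameters for a CloudWatch Logs subscription filter that drops
# HeadBucket events before forwarding logs onward. In practice you would
# pass these to:
#   boto3.client("logs").put_subscription_filter(**params)
params = {
    "logGroupName": "/aws/cloudtrail/my-trail",  # hypothetical
    "filterName": "drop-headbucket",
    # CloudWatch Logs JSON filter pattern: forward everything except
    # HeadBucket events.
    "filterPattern": '{ $.eventName != "HeadBucket" }',
    "destinationArn": "arn:aws:firehose:us-east-1:123456789012:deliverystream/to-siem",
    "roleArn": "arn:aws:iam::123456789012:role/cwlogs-to-firehose",  # hypothetical
}
```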

Within your SIEM, you can create filters by log source to ignore—or throw out—log data before it’s analyzed for threats and saved to your data lake. Two common methods are raw data filters and normalized data filters.

A raw data filter processes and filters log data before the SIEM normalizes the data according to its schemas. A raw data filter specifies a pattern to match against; if any log matches the pattern, the entire log is ignored and not processed by the SIEM. The available tools for filtering raw data depend on the SIEM you are working with, but these two are the most common:

  • Regular expressions (regex). A sequence of characters that represent a pattern that searches for, matches with, and manipulates strings in text. For example, you could specify the regex /\d/g to filter out any raw log data that contains a number. The text “5 events” would be thrown out, and “five events” would be processed.
  • Substrings. Text, or a sequence of characters, used as a matcher when analyzing other text. For example, the substring “get” could be used to filter out any raw log data that contains “get” at least once. The text “get event” would be thrown out, but “post event” would be processed.
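Both approaches can be sketched in a few lines of Python, independent of any particular SIEM’s configuration format; the sample log lines are invented:

```python
import re

# Invented sample CloudTrail-style raw log lines.
raw_logs = [
    '{"eventName":"GetObject","eventSource":"s3.amazonaws.com"}',
    '{"eventName":"PutObject","eventSource":"s3.amazonaws.com"}',
    '{"eventName":"DeleteBucket","eventSource":"s3.amazonaws.com"}',
]

# Regex filter: drop any raw log whose eventName starts with "Get".
drop_pattern = re.compile(r'"eventName":"Get\w+"')
# Substring filter: drop any raw log containing this exact text.
drop_substring = "PutObject"

kept = [
    line for line in raw_logs
    if not drop_pattern.search(line) and drop_substring not in line
]
print(kept)  # only the DeleteBucket line survives
```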

In contrast, a normalized data filter processes and filters log data after the SIEM normalizes the data. The benefit here is a granular filter that can throw out individual fields—or “keys”, pieces of data within a log—instead of the entire log. 

To create a normalized data filter, you’ll typically need the following information:

  • The log type. For example, AWS CloudTrail.
  • A field. The identifier or “key” for the data you wish to remove.
  • A condition. For example, “contains”, “is less than”, or “has any”.
  • A value. The state of the data for the field that you want to remove. This could include having no data.

With both filtering methods, any logs that are dropped during the filtering process should not contribute to your overall ingestion quota; verify pricing with your SIEM vendor.
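A sketch of how such a filter might behave. The p_log_type field and the condition names are assumptions for illustration, not a specific SIEM’s API; the key point is that only the matching field is dropped, not the whole log:

```python
# Sketch of a normalized data filter defined by log type, field, condition,
# and value.
def matches(log, field, condition, value):
    actual = log.get(field)
    if condition == "equals":
        return actual == value
    if condition == "contains":
        return value in (actual or "")
    if condition == "is empty":
        return actual in (None, "", [], {})
    raise ValueError(f"unknown condition: {condition}")

def apply_filter(log, log_type, field, condition, value):
    """Drop only the matching field from the log, not the whole log."""
    if log.get("p_log_type") == log_type and matches(log, field, condition, value):
        log = dict(log)  # leave the original untouched
        log.pop(field, None)
    return log

event = {"p_log_type": "AWS.CloudTrail", "eventName": "HeadBucket"}
filtered = apply_filter(event, "AWS.CloudTrail", "eventName", "equals", "HeadBucket")
```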

An example: Filtering out irrelevant S3 events

Filtering starts by determining what information is irrelevant to security. For an example, let’s work with S3 buckets.

There are a variety of actions that you can take on an S3 bucket that get reported as CloudTrail data events. The action HeadBucket is useful when you need to check whether a bucket exists and whether you have permission to access it, but tracking these events is not vital to security. To prevent raw log data that contains HeadBucket events from being processed in your SIEM, create a filter with the regex /"eventName":\s*"HeadBucket"/.

With an AWS CloudTrail event selector, do the opposite: identify the events that you want your trail to process, such as DeleteBucket.
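Before deploying a filter like this, it’s worth sanity-checking the pattern locally against sample records; a quick sketch (the \s* tolerates optional whitespace after the colon, since raw JSON may be compact or pretty-printed):

```python
import re

# A HeadBucket filter expressed as a Python regex.
head_bucket = re.compile(r'"eventName":\s*"HeadBucket"')

raw = [
    '{"eventName": "HeadBucket", "eventSource": "s3.amazonaws.com"}',
    '{"eventName": "DeleteBucket", "eventSource": "s3.amazonaws.com"}',
]
kept = [line for line in raw if not head_bucket.search(line)]
print(kept)  # only the DeleteBucket line survives
```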

Enrich logs to increase alert fidelity

Alongside filtering, enriching your log data with context and threat intelligence increases the fidelity of your alerts and speeds up investigation and incident response. A classic example is adding information about business assets to log data, like a user-to-hardware mapping. Another example is mapping numeric IDs or error codes to human readable information.

To enrich logs, create a lookup table, a custom data set that you upload to your SIEM and configure to enrich one or more specified log types. The next image shows how this process works: a lookup table of known bad actors enriches an incoming log by matching the IP address.
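The lookup flow can be sketched in a few lines. Here the table is keyed on CloudTrail’s sourceIPAddress field, which is real; the table contents and the "enrichment" field name are hypothetical:

```python
# Sketch: enrich incoming logs via a lookup table keyed by IP address.
bad_actors = {
    "203.0.113.7": {"label": "known bad actor", "first_seen": "2024-01-02"},
}

def enrich(log, lookup, key_field="sourceIPAddress"):
    """Attach lookup-table context to a log when its key field matches."""
    match = lookup.get(log.get(key_field))
    if match:
        log = {**log, "enrichment": match}
    return log

event = {"eventName": "GetObject", "sourceIPAddress": "203.0.113.7"}
enriched = enrich(event, bad_actors)
```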

Your SIEM may also partner with third-party threat intelligence providers for pre-configured log enrichment. For example, GreyNoise is a threat intelligence provider that collects, analyzes, and labels Internet-wide data on IP addresses to identify noise—irrelevant or harmless activity—that saturates security tools. When AWS S3 object-level logging is enabled for a given bucket, GreyNoise can identify S3 operations from IP addresses with known malicious classifications.

Boost signal-to-noise ratio with detection-as-code

Filtering enables you to reduce noise and control cost, and log enrichment improves the fidelity of your alerts. But another essential task is to increase your signal-to-noise ratio by tuning detections. 

Detection tuning is the process of tailoring detections so they are optimized for your specific environment. This ensures that detections are specific, informative, and cover relevant security threats, so that you can accurately identify threats and resolve them faster. This approach controls alert fatigue, reduces false positives, and improves two key performance metrics for threat detection and response: mean time to detect (MTTD) and mean time to resolve (MTTR). 

But the ability to customize detections varies across SIEMs. It’s well known that legacy SIEMs suffer from inflexible tools that limit the extent to which you can tune and optimize detections for your environment. Platforms that offer detection-as-code (DaC) are making customization and flexibility fundamental to detection development by writing, managing, and deploying detections through code. The goal is to make threat detection consistent, reliable, reusable, and scalable, all while controlling cost and providing the flexibility and customization you need to increase alert fidelity:

  • Written with widely-used programming languages. Coding languages like Python are expressive and flexible. They give you access to a host of built-in tools and libraries that provide useful out-of-the-box functionality, like crunching numbers, or grouping routine processes or information into reusable functions or variables. This makes writing detections a flexible process where it’s easy to customize detections and cover security gaps.
  • Testing and quality assurance (QA). Teams can create code-based unit tests that sit alongside the code-based detection rule. Code-based unit tests verify the efficacy and reliability of detection logic, ensuring correct, relevant, and high-fidelity alerts.
  • Managed in version control. Managing detection rules in version control facilitates effective change tracking, rollbacks, and audits. Coded detections are self-documenting and anyone with access can easily understand the detection environment. These systems also provide tools like pull requests for standardized peer review and collaboration, so you can maintain consistent quality in threat detection development.
  • Automated operations. Integrating DaC into CI/CD pipelines automates testing and deployment, streamlining operations and ensuring consistency and scalability by eliminating manual work.
  • No-code workflows. Security teams can benefit from DaC—including expressiveness, testability, CI/CD integration, and reusability—without requiring every team member to have programming experience. Practitioners can implement a console-based workflow using forms or a user-friendly markup language to create, test, deploy, and automate detections. 

Check out the next image to get a sense of what a code-based detection looks like. 

The image shows an excerpt from a Python detection for Identity and Access Management (IAM), one of Panther’s 500+ pre-built detections. The logic in this rule will trigger an alert when a user successfully assumes a role ARN (roleArn) that’s defined in ASSUME_ROLE_BLOCKLIST, a predefined list of blocklisted roles.

In the code excerpt, notice aws_cloudtrail_success (line 12) and deep_get (line 16); these are custom helper functions that encapsulate routine processes for reuse across your detections, one of the benefits of writing detections in code discussed earlier. What’s not shown in this excerpt is the logic that defines what information goes into the alert, tests for the detection, and other ways to customize how and when the alert is triggered. To get a closer look at DaC, including no-code workflows, check out how to create a code-based detection.
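As a rough sketch of what such a rule can look like, here is a simplified version; the blocklist contents, helper implementation, and success check below are illustrations, not Panther’s actual code:

```python
# Illustrative blocklist; a real deployment would list its own sensitive roles.
ASSUME_ROLE_BLOCKLIST = ["arn:aws:iam::123456789012:role/FullAdminRole"]

def deep_get(dictionary, *keys, default=None):
    """Safely fetch a nested key, e.g. deep_get(event, "requestParameters", "roleArn")."""
    for key in keys:
        if not isinstance(dictionary, dict):
            return default
        dictionary = dictionary.get(key, default)
    return dictionary

def rule(event):
    # Alert only on successful AssumeRole calls for a blocklisted role ARN.
    return (
        event.get("eventName") == "AssumeRole"
        and event.get("errorCode") is None  # no errorCode means the call succeeded
        and deep_get(event, "requestParameters", "roleArn") in ASSUME_ROLE_BLOCKLIST
    )

sample = {
    "eventName": "AssumeRole",
    "requestParameters": {"roleArn": "arn:aws:iam::123456789012:role/FullAdminRole"},
}
print(rule(sample))  # True
```

A code-based unit test for this rule would simply assert rule(...) against fixture events like the one above, which is what makes DaC testable in CI/CD.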

To summarize, detection-as-code is all about efficiency and reliability; it gives security practitioners the flexibility to optimize their detections and the agility to stay on top of threats in a dynamic cybersecurity landscape.

Built for the cloud

Traditional SIEMs struggle with the demands of cloud workloads, often compromising on timeliness and cost. Whether you control cost and noise with filtering, or boost your signal-to-noise ratio with detection-as-code, choose a SIEM that is built to handle cloud-scale data, without compromise.

For a comprehensive review on managing AWS logs, read the ebook Keep AWS Logs from Running Wild by Putting Panther in Charge. Panther is a cloud-native SIEM that empowers modern security teams with real-time threat detection, log aggregation, incident response, and continuous compliance, at cloud-scale. 

Ready to try Panther? Get started by requesting a demo.
