Why You Should Be Ingesting AWS VPC Flow Logs

TL;DR To make ingesting noisy VPC Flow logs affordable, invest in filtering, enrichment, and tuning, alongside adopting a cloud-native SIEM with zero infrastructure management.

Security practitioners who monitor their cloud networks with VPC Flow Logs know that this AWS service can get excessively expensive, and quickly.

This blog will guide you on how to manage VPC Flow logs so that you can not only control costs, but decongest your threat detection environment and enhance the fidelity of your alerts. To this end, you’ll learn how to filter out irrelevant log data, enrich logs with context and threat intelligence, and boost your overall signal-to-noise ratio with detection-as-code. You’ll also understand the limitations of SIEM infrastructure and licensing that may prevent you from cost-effective ingestion.

Let’s start by assessing the core issue—noisy logs.

The Problem of Noisy Logs

When a log source rapidly produces large quantities of logs, practitioners dub it “noisy” to reflect the challenges such sources present.

The first challenge is about threat detection. Irrelevant logs congest the threat detection environment, making it that much harder to accurately identify threats. A congested environment can lead to more false positives, a higher mean time to resolve (MTTR), and alert fatigue, all of which increase risk. The March 2023 attack on 3CX’s supply chain is a reminder of the consequences of alert fatigue and why teams must address it.

The second challenge is a matter of cost. Quite plainly, the more data your Security Information and Event Management (SIEM) platform ingests, the higher the costs will be. But with the volume of VPC Flow logs, expensive easily becomes cost-prohibitive, forcing security teams to stop ingesting and putting comprehensive security at risk.

However, a big culprit driving up spending is SIEM infrastructure and licensing models that are unfriendly to noisy, high-volume cloud logs. To maintain a baseline of performance, traditional SIEMs require ongoing database management to adjust how logs are indexed based on write frequency and infrastructure hardware profile. Despite this overhead, only a small subset of data remains in hot storage for rapid search. Most data resides in warm or cold storage that costs less, but takes much longer to query during incident response and threat hunting. 

Facing these obstacles on top of additional licensing fees for cloud logs, security teams often do not ingest VPC Flow logs into their SIEM, but instead silo their log data in an S3 bucket, if they retain it at all.

Why You Should Be Ingesting AWS VPC Flow Logs

With Amazon’s Virtual Private Cloud (VPC), you can create a virtual network to house your AWS resources with all the typical network controls, like the ability to configure gateways, subnets, IP addresses, and routing. These controls are fundamental not just to network administration but also to preventing network-based attacks.

VPC Flow logs give you visibility into the IP traffic flowing to and from the network interfaces in your VPC, enabling you to diagnose issues with security policies and decide how to direct traffic between interfaces. Further, you can correlate this data with users or devices to help determine malicious activity. Here are just a few examples of the threats you can detect with VPC Flow logs:

  • Use of non-standard ports. Traffic that uses a non-standard port for a standard protocol, such as what appears to be HTTPS traffic arriving on the SMTPS port, could indicate that an attacker is trying to communicate with compromised systems on your network. Practitioners can catch these by monitoring inbound traffic that falls outside their allowlists or matches their blocklists.
  • Requests from known crypto mining domains. If your network receives a DNS lookup request from a known crypto mining domain name, this could indicate an attacker seeking to hijack resources on your network to perform resource-intensive crypto-mining.
  • Traffic to non-approved DNS servers. When outbound DNS traffic is detected to a non-approved DNS server, this could indicate an attacker is seeking to exfiltrate data or perform command and control for compromised hosts.
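Detections like the last bullet reduce to simple checks over parsed flow records. As a minimal sketch (the approved-server list and field names are illustrative, following the default VPC Flow log fields, not any vendor's API):

```python
# Hypothetical sketch: flag outbound DNS traffic (port 53) to servers
# outside an approved list. Field names mirror default VPC Flow fields;
# the approved-server set is invented for illustration.
APPROVED_DNS_SERVERS = {"10.0.0.2"}  # e.g., the VPC's own resolver (assumed)

def is_suspicious_dns(record: dict) -> bool:
    """Return True for DNS traffic headed to a non-approved server."""
    is_dns = record.get("dstport") == 53
    return is_dns and record.get("dstaddr") not in APPROVED_DNS_SERVERS

print(is_suspicious_dns({"dstaddr": "198.51.100.7", "dstport": 53}))  # True
```

In a real deployment this check would run inside your SIEM's detection engine against every normalized flow record.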

To identify and tackle the most advanced and emerging threats, practitioners need their SIEM to ingest VPC Flow logs alongside all other log sources. Now let’s tackle how to manage the challenges caused by noisy VPC Flow logs.

Migrate to a Cloud-Based SIEM to Control Costs

If you are still using a traditional SIEM, migrating to a cloud-based SIEM will have the most significant impact on your ability to cost-effectively ingest AWS logs and gain full system visibility. 

Here’s why: a cloud-based SIEM leverages serverless architecture to eliminate infrastructure management; further, log data is saved in a security data lake that uses modern cloud database technology designed to not only handle cloud-scale data, but guarantee query performance. This enables cloud-based SIEM platforms to deliver exactly what modern security teams need:

  • Petabyte-scale search. By separating storage from compute, the security data lake provides hot storage going back as far as 365 days for rapid investigation and response, even when running concurrent workloads.
  • Unified data model (UDM). A UDM configures and standardizes a set of unified fields across all log types, making normalization work optional rather than required.
  • Instant scalability. With a security data lake, you can scale resources up or down as needed for incident response and threat hunting.
  • Fully managed data. The security data lake includes data encryption, compression, governed policies and rules, data-masking, automated JSON parsing, and more, reducing data management workloads. 

Filter Raw Log Data to Reduce Noise and Control Costs

Short of adopting a cloud-native SIEM, the best way to manage AWS VPC Flow noise and cost is to filter out unwanted data. There are a few ways to do this depending on how your log pipeline is set up.

Within AWS, the VPC Flow Logs service offers only coarse filtering at capture time: you can choose to record accepted traffic, rejected traffic, or all traffic. However, if your log pipeline uses Amazon CloudWatch to aggregate and forward logs, you can use subscription filters to control which logs are sent to your SIEM.
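As a rough sketch of that approach (the log group, filter name, destination ARN, and field names below are placeholders, not values from this article), a CloudWatch Logs subscription filter using the space-delimited pattern syntax could forward only REJECT records from the default 14-field VPC Flow format:

```shell
# Sketch only: all names and the ARN are placeholders.
aws logs put-subscription-filter \
  --log-group-name "vpc-flow-logs" \
  --filter-name "reject-only" \
  --filter-pattern '[version, account_id, interface_id, srcaddr, dstaddr, srcport, dstport, protocol, packets, bytes, start, end, action="REJECT", log_status]' \
  --destination-arn "arn:aws:lambda:us-east-1:123456789012:function:forward-to-siem"
```

Everything the pattern does not match stays in CloudWatch and never reaches your SIEM's ingestion meter.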

Within your SIEM, you can create filters by log source to ignore—or throw out—log data before it’s analyzed for threats and saved to your data lake. Two common methods that SIEMs offer are raw data filters and normalized data filters.

A raw data filter processes and filters log data before the SIEM normalizes the data according to its schemas. A raw data filter specifies a pattern to match against; if a log matches the pattern, the entire log is ignored and never processed by the SIEM. The available tools for filtering raw log data depend on the SIEM you are working with, but these are the two most common:

  • Regular expressions (regex). A sequence of characters that defines a pattern used to search for, match, and manipulate strings in text. For example, you could specify the regex /\bevent\b/ to filter out any raw log data that contains the exact whole word “event”. The text “no event” would be thrown out, while “noevent” and “no events” would be processed.
  • Substrings. Text, or a sequence of characters, used as a matcher when analyzing other text. For example, the substring “ACCEPT” could be used to filter out any raw log data that contains “ACCEPT” at least once. A log line containing “ACCEPT” would be thrown out, but one containing only “REJECT” would be processed.
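Both raw-filter styles can be sketched in a few lines. The patterns and sample lines below are invented for demonstration and do not represent any SIEM's filter syntax:

```python
import re

# Illustrative sketch of raw-data filtering before SIEM processing.
WORD_EVENT = re.compile(r"\bevent\b")  # whole-word regex filter
SUBSTRING = "DEBUG"                    # substring filter (assumed example)

def keep_log(raw: str) -> bool:
    """Return False (drop the line) when either filter matches."""
    return not (WORD_EVENT.search(raw) or SUBSTRING in raw)

print(keep_log("no event"))      # False: whole word "event" matches
print(keep_log("no events"))     # True: "events" is not the whole word
print(keep_log("a DEBUG line"))  # False: substring match
```

Dropped lines never reach normalization, analysis, or storage, which is what keeps them off your ingestion bill.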

In contrast, a normalized data filter processes and filters log data after the SIEM normalizes the data. The benefit here is a granular filter that can throw out individual fields—or “keys”, pieces of data within a log—instead of the entire log. 

To create a normalized data filter, you’ll typically need the following information:

  • The log type. For example, AWS VPC Flow.
  • A field. The identifier or “key” for the data you wish to remove.
  • A condition. For example, “contains”, “is less than”, or “has any”.
  • A value. The state of the data for the field that you want to remove. This could include having no data. 
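Putting those four pieces together, a normalized filter is essentially a predicate evaluated against each parsed log. A minimal sketch, assuming hypothetical field, condition, and log-type names rather than any vendor's actual configuration format:

```python
# Hypothetical sketch of evaluating a normalized-data filter spec
# (log type, field, condition, value) after the SIEM has parsed the log.
def matches_filter(log: dict, spec: dict) -> bool:
    """Return True when the spec matches, i.e., the field would be dropped."""
    if log.get("log_type") != spec["log_type"]:
        return False
    value = log.get(spec["field"])
    if spec["condition"] == "contains":
        return value is not None and spec["value"] in value
    if spec["condition"] == "is_empty":
        return value in (None, "")
    return False

spec = {"log_type": "AWS.VPCFlow", "field": "log_status",
        "condition": "contains", "value": "NODATA"}
print(matches_filter({"log_type": "AWS.VPCFlow", "log_status": "NODATA"}, spec))  # True
```

Because the match happens after parsing, the filter can surgically drop one field while keeping the rest of the log.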

With both filtering methods, any logs that are dropped during the filtering process should not contribute to your overall ingestion quota; verify pricing with your SIEM vendor.

An Example: Filtering Out Empty Events

Filtering starts by determining what information is irrelevant to security. For an example, let’s see how to filter out empty VPC Flow log data.

When there is no network traffic to or from the network interface during a VPC Flow aggregation interval, VPC Flow generates a default log containing the statement “NODATA”, representing an “empty” log that has no data. To target these empty logs and filter them out using a raw data filter, use either the regex filter /NODATA/ or the substring filter “NODATA”.
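In code, both filters come down to the same check. The sample line below mimics the default VPC Flow record format with invented values:

```python
import re

# Sketch: identifying "empty" VPC Flow records via the NODATA marker.
def is_empty_record(raw: str) -> bool:
    # The regex /NODATA/ and the substring "NODATA" behave identically here.
    return bool(re.search(r"NODATA", raw))

empty = "2 123456789012 eni-0abc123 - - - - - - - 1600000000 1600000600 - NODATA"
print(is_empty_record(empty))  # True
```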

Enrich Logs to Increase Alert Fidelity

Alongside filtering, enriching your log data with context and threat intelligence increases the fidelity of your alerts and speeds up investigation and incident response. A classic example is adding information about business assets to log data, like a user-to-hardware mapping. Another example is mapping numeric IDs or error codes to human readable information.

To enrich logs, create a lookup table: a custom data set that you upload to your SIEM and configure to enrich one or more specified log types. The next image shows how this process works, where a lookup table of known bad actors enriches an incoming log by the matching IP address.
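Conceptually, that enrichment is a keyed join between the lookup table and each incoming log. A minimal sketch with entirely invented data:

```python
# Sketch of lookup-table enrichment: a table of known bad actors keyed
# by IP address adds context to matching logs. All data is invented.
BAD_ACTORS = {
    "203.0.113.10": {"actor": "known-scanner", "risk": "high"},
}

def enrich(log: dict) -> dict:
    """Attach lookup-table context when the source IP matches."""
    match = BAD_ACTORS.get(log.get("srcaddr"))
    return {**log, "enrichment": match} if match else log

print(enrich({"srcaddr": "203.0.113.10"})["enrichment"]["risk"])  # high
```

Downstream detections can then alert on the enriched field directly instead of re-deriving the context on every rule.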

Your SIEM may also partner with third-party threat intelligence providers for out-of-the-box log enrichment. For example, your SIEM may provide a regularly updated list of Tor exit nodes. Tor is an anonymizing network for Internet browsing in which a user’s traffic exits through a relay, called an exit node, randomly picked from nodes around the world, so the IP address a server sees belongs to the exit node rather than the user. A regularly updated list of Tor exit nodes can help you determine whether a request is coming from Tor and whether it could be malicious.

Boost Signal-to-Noise Ratio with Detection-as-Code

Filtering enables you to reduce noise and control cost, and log enrichment improves the fidelity of your alerts. But another essential task is to increase your signal-to-noise ratio by tuning detections. 

Detection tuning is the process of customizing detections so they are optimized for your specific environment. This ensures that detections are specific, informative, and cover relevant security threats, so that you can accurately identify threats and resolve them faster. In terms of metrics, detection tuning controls alert fatigue, reduces false positives and false negatives, and improves two key performance metrics for threat detection and response: mean time to detect (MTTD) and mean time to resolve (MTTR). 

In other words, you’ll clearly hear the signal amid the noise.

But the ability to customize detections varies across SIEMs. It’s well known that legacy SIEMs suffer from inflexible tools that limit the extent to which you can tune and optimize detections for your environment. Platforms that offer detection-as-code (DaC) are making customization and flexibility fundamental to detection creation by writing, managing, and deploying detections through code. The goal is to make threat detection processes reusable, reliable, consistent, and scalable, all while controlling cost:

  • Reusability. Detections are written in programming languages that allow for the reuse and adaptation of detection logic across your detection environment and new contexts. A common practice is encapsulating routine processes in functions to use again and again, saving you time and ensuring consistency. 
  • Reliability. Coding languages like Python offer extensive built-in tools and libraries for extending or customizing logic so you can cover security gaps and ensure reliable threat detection. Reliability is further verified by writing code-based unit tests that accompany the detection rules. Detection code is managed in version control systems like GitHub, facilitating reliable and accessible change tracking, auditing, and rollbacks.
  • Consistency. Version control systems facilitate testing and deployment automation by integrating into continuous integration and continuous deployment (CI/CD) pipelines. Automation ensures consistency by reducing manual errors. These systems also provide tools like pull requests for standardized peer review and collaboration, maintaining consistent quality in threat detection development.
  • Scalability. The automated testing and deployment of detection rules allow for management that scales with infrastructure growth. Code-based detections are reusable and adaptable to changing data sources, supporting scalability. Version control facilitates rapid development and deployment of new rules to meet emerging threats, enhancing scalable threat detection capabilities.
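The reusability and reliability bullets above can be sketched together: a shared helper used by a rule, with code-based unit tests that would run in CI before deployment. The rule, event shape, and helper below are illustrative, not Panther's actual API:

```python
import ipaddress

# Sketch of detection-as-code practices: a reusable helper shared
# across rules, plus unit tests that gate deployment.
def is_private(ip: str) -> bool:
    """Reusable helper: True for private/special-use addresses."""
    return ipaddress.ip_address(ip).is_private

def rule(event: dict) -> bool:
    """Alert on traffic from a private host to a public destination."""
    return is_private(event["srcaddr"]) and not is_private(event["dstaddr"])

# Unit tests accompanying the rule, run automatically in CI.
assert rule({"srcaddr": "10.0.0.5", "dstaddr": "1.1.1.1"}) is True
assert rule({"srcaddr": "10.0.0.5", "dstaddr": "192.168.1.2"}) is False
print("tests passed")
```

Because the helper and tests live in version control, every change is reviewed, tested, and deployable through the same pipeline as any other code.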

Now let’s get a sense of what a code-based detection looks like. The next image shows an excerpt from a Python detection for clients that may be performing crypto mining, one of Panther’s 500+ pre-built detections. The logic in this rule triggers an alert when log data contains a domain that’s in the predefined list CRYPTO_MINING_DOMAINS.

In the code excerpt, notice rstrip (line 7); this built-in Python string method handles routine string manipulation, in this case removing trailing characters from a string. Access to reusable tools like this is one of the benefits of writing detections in code discussed earlier. What’s not shown in this excerpt is the logic that defines what information goes into the alert, tests for the detection, and other ways to customize how and when the alert is triggered.
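For readers who can't see the image, here is a rough sketch of that kind of rule; the domain list, event fields, and rstrip usage below are stand-ins, not Panther's actual excerpt:

```python
# Rough sketch only (not the actual excerpt shown in the image).
CRYPTO_MINING_DOMAINS = {"pool.example-miner.com"}  # placeholder list

def rule(event: dict) -> bool:
    # rstrip(".") strips the trailing dot from fully qualified DNS names.
    domain = event.get("query_name", "").rstrip(".").lower()
    return domain in CRYPTO_MINING_DOMAINS

print(rule({"query_name": "pool.example-miner.com."}))  # True
```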

Best of all, the benefits of DaC—including expressiveness, testability, CI/CD integration, and reusability—are available to practitioners without programming experience. Security teams can implement no-code workflows using forms or a user-friendly markup language to create, test, deploy, and automate detections. To get a closer look at DaC, including no-code workflows, check out how to create a code-based detection.

To summarize, detection-as-code is all about efficiency and reliability; it gives security practitioners the flexibility to optimize their detections and the agility to stay on top of threats in a dynamic cybersecurity landscape.

A Cloud-Native SIEM

When you need to manage high-volume AWS logs, the most important tool is a cloud-native SIEM that can handle cloud-scale data. Traditional SIEMs can’t keep up with the demands of cloud workloads, compromising on timeliness and cost. Learn how to evaluate security platforms to determine whether they’re fit for your cloud data.

For a comprehensive review on managing AWS logs, read the ebook Keep AWS Logs from Running Wild by Putting Panther in Charge. Panther is a cloud-native SIEM that empowers modern security teams with real-time threat detection, log aggregation, incident response, and continuous compliance, at cloud-scale. 

Ready to try Panther? Get started by requesting a demo.
