TL;DR: To make ingesting noisy VPC Flow logs affordable, invest in filtering, enrichment, and detection tuning, and adopt a cloud-native SIEM that requires no infrastructure management.
Security practitioners who monitor their cloud networks with VPC Flow know that this AWS service can get excessively expensive, and quickly.
This blog will guide you on how to manage VPC Flow logs so that you can not only control costs, but decongest your threat detection environment and enhance the fidelity of your alerts. To this end, you’ll learn how to filter out irrelevant log data, enrich logs with context and threat intelligence, and boost your overall signal-to-noise ratio with detection-as-code. You’ll also understand the limitations of SIEM infrastructure and licensing that may prevent you from cost-effective ingestion.
Let’s start by assessing the core issue—noisy logs.
When a log source rapidly produces large quantities of logs, practitioners call it “noisy” to reflect the challenges it presents.
The first challenge is about threat detection. Any number of irrelevant logs congests the threat detection environment, making the job of accurately identifying threats that much harder. A congested environment can lead to an increase in false positives, mean time to resolve (MTTR), and alert fatigue, all of which increase risk. The March 2023 attack on 3CX’s supply chain is a reminder of the consequences of alert fatigue and why teams must address it.
The second challenge is a matter of cost. Quite plainly, the more data your Security Information and Event Management (SIEM) platform ingests, the higher the costs will be. But with the volume of VPC Flow logs, expensive easily becomes cost-prohibitive, forcing security teams to stop ingesting and put comprehensive security at risk.
However, a big culprit driving up spending is SIEM infrastructure and licensing models that are unfriendly to noisy, high-volume cloud logs. To maintain a baseline of performance, traditional SIEMs require ongoing database management to adjust how logs are indexed based on write frequency and infrastructure hardware profile. Despite this overhead, only a small subset of data remains in hot storage for rapid search. Most data resides in warm or cold storage that costs less, but takes much longer to query during incident response and threat hunting.
Facing these obstacles on top of additional licensing fees for cloud logs, security teams often do not ingest VPC Flow logs into their SIEM. Instead, they silo the log data in an S3 bucket, if they keep it at all.
With Amazon’s Virtual Private Cloud (VPC), you can create a virtual network to house your AWS resources with all the typical network controls, like the ability to configure gateways, subnets, IP addresses, and routing. These controls are fundamental not just to network administration, but also to preventing network-based attacks.
VPC Flow logs give you visibility into the IP traffic going to and from the network interfaces in your VPC, enabling you to diagnose issues with security policies and decide how to direct traffic between interfaces. Further, you can correlate this data with users or devices to assist in determining malicious activity. Here are just a few examples of the threats you can detect with VPC Flow logs:
To identify and tackle the most advanced and emerging threats, practitioners need their SIEM to ingest VPC Flow logs alongside all other log sources. Now let’s tackle how to manage the challenges caused by noisy VPC Flow logs.
If you are still using a traditional SIEM, migrating to a cloud-based SIEM will have the most significant impact on your ability to cost-effectively ingest AWS logs and gain full system visibility.
Here’s why: a cloud-based SIEM leverages serverless architecture to eliminate infrastructure management; further, log data is saved in a security data lake that uses modern cloud database technology designed to not only handle cloud-scale data, but guarantee query performance. This enables cloud-based SIEM platforms to deliver exactly what modern security teams need:
Short of adopting a cloud-native SIEM, the best way to manage AWS VPC Flow noise and cost is to filter out unwanted data. There are a few ways to do this depending on how your log pipeline is set up.
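Within AWS, the VPC Flow service offers only coarse filtering: when you create a flow log, you choose whether to capture accepted traffic, rejected traffic, or all traffic. For finer control, if your log pipeline uses Amazon CloudWatch Logs to aggregate and forward logs, you can use subscription filters to control which logs are sent to your SIEM.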
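As a minimal sketch of the subscription-filter approach, the Python below builds a space-delimited CloudWatch Logs filter pattern that forwards only records whose last field is OK, plus the keyword arguments for the PutSubscriptionFilter API. It assumes flow logs are published in the default 14-field format. The field labels, the filter name, and the helper functions are hypothetical choices made for illustration, and some destination types also need a role ARN that is not shown here.

```python
# Labels for the 14 space-delimited fields of the default VPC Flow log
# format. The labels themselves are our choice; only their order matters.
FLOW_LOG_FIELDS = [
    "version", "account", "eni", "source", "destination", "srcport",
    "destport", "protocol", "packets", "bytes", "windowstart", "windowend",
    "action", "flowlogstatus",
]


def build_filter_pattern(conditions):
    """Build a space-delimited filter pattern with quoted-string matches.

    `conditions` maps a field label to the exact value it must equal.
    """
    unknown = set(conditions) - set(FLOW_LOG_FIELDS)
    if unknown:
        raise ValueError(f"unknown field labels: {sorted(unknown)}")
    parts = [
        f'{name}="{conditions[name]}"' if name in conditions else name
        for name in FLOW_LOG_FIELDS
    ]
    return "[" + ", ".join(parts) + "]"


def subscription_filter_params(log_group, destination_arn):
    """Keyword arguments for a PutSubscriptionFilter request."""
    return {
        "logGroupName": log_group,
        "filterName": "forward-ok-flow-logs",  # hypothetical name
        "filterPattern": build_filter_pattern({"flowlogstatus": "OK"}),
        "destinationArn": destination_arn,
    }
```

With boto3, these arguments would typically be passed as `client.put_subscription_filter(**params)` on a CloudWatch Logs client. Check the pattern against a sample of your own log group before relying on it.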
Within your SIEM, you can create filters by log source to ignore—or throw out—log data before it’s analyzed for threats and saved to your data lake. Two common methods that SIEMs offer are raw data filters and normalized data filters.
A raw data filter processes and filters log data before the SIEM normalizes the data according to its schemas. A raw data filter specifies a pattern to match. If any log matches the pattern, the entire log is ignored and not processed by the SIEM. The available tools for filtering raw log data depend on the SIEM you are working with, but these are the two most common:
In contrast, a normalized data filter processes and filters log data after the SIEM normalizes the data. The benefit here is a granular filter that can throw out individual fields—or “keys”, pieces of data within a log—instead of the entire log.
To create a normalized data filter, you’ll typically need the following information:
With both filtering methods, any logs that are dropped during the filtering process should not contribute to your overall ingestion quota; verify pricing with your SIEM vendor.
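To make the normalized data filter concrete, here is a rough sketch of the idea rather than any particular SIEM's implementation. It normalizes a default-format VPC Flow record into named keys and then drops selected keys while keeping the rest of the log. The underscore key names and both helper functions are assumptions made for illustration.

```python
# Assumed normalized key names for the default VPC Flow log format.
FIELDS = (
    "version", "account_id", "interface_id", "srcaddr", "dstaddr",
    "srcport", "dstport", "protocol", "packets", "bytes",
    "start", "end", "action", "log_status",
)


def normalize(raw_line):
    """Split a space-delimited flow log record into a dict of named keys."""
    values = raw_line.split()
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(values)}")
    return dict(zip(FIELDS, values))


def drop_keys(event, keys_to_drop):
    """Return a copy of the event without the unwanted keys."""
    return {key: value for key, value in event.items() if key not in keys_to_drop}


sample = (
    "2 123456789010 eni-1235b8ca123456789 172.31.16.139 172.31.16.21 "
    "20641 22 6 20 4249 1418530010 1418530070 ACCEPT OK"
)
slim_event = drop_keys(normalize(sample), {"version", "account_id"})
```

The log itself survives the filter; only the keys judged irrelevant to detection are discarded.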
Filtering starts by determining what information is irrelevant to security. For an example, let’s see how to filter out empty VPC Flow log data.
When there is no network traffic to or from the network interface during a VPC Flow aggregation interval, VPC Flow generates a log record whose log-status field is “NODATA”, representing an “empty” log that has no data. To target these empty logs and filter them out using a raw data filter, use either the regex filter /NODATA/ or the substring filter “NODATA”.
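The two raw data filters for NODATA records can be sketched in Python as follows. This only illustrates the matching logic and is not a specific SIEM's filter engine; the function names are hypothetical.

```python
import re

NODATA_REGEX = re.compile(r"NODATA")


def keep_by_regex(raw_line):
    """Keep the line unless the regex filter /NODATA/ matches it."""
    return NODATA_REGEX.search(raw_line) is None


def keep_by_substring(raw_line):
    """Keep the line unless it contains the substring "NODATA"."""
    return "NODATA" not in raw_line


def filter_raw(lines, keep=keep_by_regex):
    """Drop whole raw log lines before any normalization happens."""
    return [line for line in lines if keep(line)]


raw_logs = [
    "2 123456789010 eni-1235b8ca123456789 172.31.16.139 172.31.16.21 "
    "20641 22 6 20 4249 1418530010 1418530070 ACCEPT OK",
    "2 123456789010 eni-1235b8ca123456789 - - - - - - - "
    "1431280876 1431280934 - NODATA",
]
kept = filter_raw(raw_logs)
```

For a fixed token like this one, the substring filter and the regex filter drop the same lines. The regex form becomes useful when the pattern needs alternation or anchoring.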
Alongside filtering, enriching your log data with context and threat intelligence increases the fidelity of your alerts and speeds up investigation and incident response. A classic example is adding information about business assets to log data, like a user-to-hardware mapping. Another example is mapping numeric IDs or error codes to human readable information.
To enrich logs, create a lookup table, a custom data set that you upload to your SIEM and configure to enrich one or more specified log types. The next image shows how this process works, where a lookup table of known bad actors enriches an incoming log by the matching IP address 1.1.1.1.
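In code, the lookup step might look roughly like the sketch below. The table contents, the `enrichment` key, and the `enrich` helper are made up for illustration. Only the matching IP address 1.1.1.1 comes from the example above.

```python
# Hypothetical lookup table keyed by IP address.
KNOWN_BAD_ACTORS = {
    "1.1.1.1": {"label": "known bad actor", "source": "example lookup table"},
}


def enrich(event, lookup_table, selector="srcaddr"):
    """Attach lookup table data when the selector field matches a table key."""
    enriched = dict(event)  # leave the incoming event untouched
    match = lookup_table.get(event.get(selector))
    if match is not None:
        enriched["enrichment"] = {selector: match}
    return enriched


incoming = {"srcaddr": "1.1.1.1", "dstaddr": "172.31.16.21", "dstport": "22"}
enriched_event = enrich(incoming, KNOWN_BAD_ACTORS)
```

Detections can then test for the enrichment directly instead of each rule carrying its own copy of the list.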
Your SIEM may also partner with third-party threat intelligence providers for out-of-the-box log enrichment. For example, your SIEM may provide a regularly updated list of Tor exit nodes. Tor is an anonymizing network for Internet browsing in which traffic leaves the network through an exit node, a relay picked from nodes around the world, so the IP address a server sees belongs to the exit node rather than to the user. A regularly updated list of Tor exit nodes can help you determine if a request is coming from Tor and whether it could be malicious.
Filtering enables you to reduce noise and control cost, and log enrichment improves the fidelity of your alerts. But another essential task is to increase your signal-to-noise ratio by tuning detections.
Detection tuning is the process of customizing detections so they are optimized for your specific environment. This ensures that detections are specific, informative, and cover relevant security threats, so that you can accurately identify threats and resolve them faster. In terms of metrics, detection tuning controls alert fatigue, reduces false positives and false negatives, and improves two key performance metrics for threat detection and response: mean time to detect (MTTD) and mean time to resolve (MTTR).
In other words, you’ll clearly hear the signal amid the noise.
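As a hedged illustration of tuning, compare a broad detection with a version adapted to one environment. Everything here is assumed for the example: the rule logic, the internal ranges, and the trusted bastion range (a documentation-only address block).

```python
import ipaddress

# Hypothetical environment-specific knowledge used for tuning.
INTERNAL_NETWORKS = [
    ipaddress.ip_network(cidr)
    for cidr in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]
TRUSTED_SOURCES = [ipaddress.ip_network("203.0.113.0/24")]  # e.g. a bastion range


def is_ssh_accept(event):
    return event.get("dstport") == "22" and event.get("action") == "ACCEPT"


def untuned_rule(event):
    """Alert on every accepted SSH flow, which is noisy in most environments."""
    return is_ssh_accept(event)


def tuned_rule(event):
    """Alert only on accepted SSH from outside known internal or trusted ranges."""
    if not is_ssh_accept(event):
        return False
    try:
        source = ipaddress.ip_address(event.get("srcaddr", ""))
    except ValueError:
        return False  # no usable source address, nothing to evaluate
    known = INTERNAL_NETWORKS + TRUSTED_SOURCES
    return not any(source in network for network in known)
```

The tuned rule stays silent for routine administrative traffic and still fires when an unfamiliar external address gets an accepted SSH connection.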
But the ability to customize detections varies across SIEMs. It’s well known that legacy SIEMs suffer from inflexible tools that limit the extent to which you can tune and optimize detections for your environment. Platforms that offer detection-as-code (DaC) are making customization and flexibility fundamental to detection creation by writing, managing, and deploying detections through code. The goal is to make threat detection processes reusable, reliable, consistent, and scalable, all while controlling cost:
Now get a sense of what a code-based detection looks like. The next image shows an excerpt from a Python detection for clients that may be performing crypto mining, one of Panther’s 500+ pre-built detections. The logic in this rule will trigger an alert when log data contains a domain that’s in the predefined list of CRYPTO_MINING_DOMAINS.
In the code excerpt, notice rstrip (line 7); this is a built-in Python string method for routine string manipulation, in this case removing trailing characters from a string. Access to reusable code is one of the benefits of writing detections in code discussed earlier. What’s not shown in this excerpt is the logic that defines what information goes into the alert, tests for the detection, and other ways to customize how and when the alert is triggered.
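For readers without the image, a detection of this shape could look like the sketch below. It is a simplified stand-in, not the actual Panther rule. The domain entries are placeholders, and the `query_name` field is an assumption about how the queried domain appears in the log.

```python
# Placeholder entries; a real list would hold known mining pool domains.
CRYPTO_MINING_DOMAINS = {
    "miner.example.com",
    "pool.example.net",
}


def rule(event):
    """Return True (raise an alert) when the queried domain is on the list."""
    domain = event.get("query_name", "")
    # DNS names are often logged fully qualified, with a trailing dot,
    # so strip it before comparing against the list.
    return domain.rstrip(".").lower() in CRYPTO_MINING_DOMAINS
```

Here `rstrip(".")` removes the trailing dot from a fully qualified name, so `miner.example.com.` and `miner.example.com` match the same list entry.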
Best of all, the benefits of DaC—including expressiveness, testability, CI/CD integration, and reusability—are available to practitioners without programming experience. Security teams can implement no-code workflows using forms or a user-friendly markup language to create, test, deploy, and automate detections. To get a closer look at DaC, including no-code workflows, check out how to create a code-based detection.
To summarize, detection-as-code is all about efficiency and reliability; it gives security practitioners the flexibility to optimize their detections and the agility to stay on top of threats in a dynamic cybersecurity landscape.
When you need to manage high-volume AWS logs, the most important tool is a cloud-native SIEM that can handle cloud-scale data. Traditional SIEMs can’t keep up with the demands of cloud workloads, compromising on timeliness and cost. Learn how to evaluate security platforms to determine whether one is fit for your cloud data.
For a comprehensive review on managing AWS logs, read the ebook Keep AWS Logs from Running Wild by Putting Panther in Charge. Panther is a cloud-native SIEM that empowers modern security teams with real-time threat detection, log aggregation, incident response, and continuous compliance, at cloud-scale.
Ready to try Panther? Get started by requesting a demo.