Optimize CloudTrail Ingestion with Modern SIEM

Brandon Min

Amazon Web Services (AWS) offers organizations numerous services, from EC2 instances (virtual machines) and S3 buckets (storage) to VPC (networking) and RDS (databases). Although AWS cloud options are expansive, each of these services presents a potential opportunity for exploitation by bad actors. 

CloudTrail, the AWS service that records API activity across an account, makes it simple for security teams to enhance event log analysis by ingesting the logs into a modern SIEM. However, this can be a blessing and a curse. 

Because consolidated CloudTrail logs are so high in volume, many SIEMs with ingestion-based pricing charge extremely high fees to ingest them. The operating cost of a legacy SIEM is also a backbreaking pain, as security teams spend more time spinning up infrastructure to ingest data than digging into historical data and analysis. 

In this article, we’ll walk through the pains of ingesting logs while operating a legacy SIEM and how a serverless, modern SIEM can change the game for security teams in search of key results at a low cost. 

Benefits of CloudTrail for a Security Team

CloudTrail event logs contain useful security information consolidated from multiple AWS services. These attributes include IP addresses, API calls, ARNs, and account IDs. Using information from these events, security teams gain three immediate benefits: 

  1. Consolidated Ingestion – Numerous AWS services can be consolidated into one log source for security monitoring. 
  2. Consistent Visibility for Security Analysis in AWS – With the breadth of information coming from CloudTrail, security teams can streamline their CloudTrail analysis and get an encompassing view of their AWS environment. 
  3. Simple Detection Development – Because a single log source with a myriad of attributes is being ingested, security engineers can use Python to write powerful detections that monitor multiple types of AWS vulnerabilities. 
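
To make the attributes listed above concrete, here is a minimal Python sketch that pulls them out of a single CloudTrail record. The field names (`eventSource`, `eventName`, `sourceIPAddress`, `userIdentity.arn`, `recipientAccountId`) follow the CloudTrail record format; the sample values and the `summarize` helper are made up for illustration.

```python
# A trimmed, made-up CloudTrail record. Real records carry many more fields.
sample_event = {
    "eventSource": "s3.amazonaws.com",
    "eventName": "PutBucketPolicy",
    "sourceIPAddress": "203.0.113.10",
    "recipientAccountId": "123456789012",
    "userIdentity": {
        "type": "IAMUser",
        "arn": "arn:aws:iam::123456789012:user/example-user",
    },
}


def summarize(event: dict) -> dict:
    """Return the security-relevant attributes of one CloudTrail record."""
    identity = event.get("userIdentity") or {}
    return {
        "api_call": f'{event.get("eventSource")}:{event.get("eventName")}',
        "source_ip": event.get("sourceIPAddress"),
        "actor_arn": identity.get("arn"),
        "account_id": event.get("recipientAccountId"),
    }


summary = summarize(sample_event)
print(summary)
```

Because every service reports through the same record shape, one helper like this can serve S3, IAM, EC2, and other events alike.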

To take advantage of these three benefits, security teams need an efficient method for ingesting CloudTrail logs into an external tool for analysis. Commonly, a SIEM is used for this task. 

Legacy SIEM Handcuffs Security Teams

Since AWS doesn’t provide a full-scale resource to analyze historical logs, a SIEM is necessary to gain complete visibility into an organization’s AWS stack. However, the scale of CloudTrail logs can be overwhelming for legacy SIEMs to ingest, leading to high operating costs and slow query performance. 

Digging into these pains a bit further, the cost of using a SIEM cannot be measured by the licensing cost of ingestion alone. The time spent operating a legacy SIEM creates a larger total cost of ownership (TCO), leading organizations to spend more money on security than originally anticipated. This is typically attributed to scaling servers when ingestion needs increase, hiring managed services to help run operations or write detections, and paying storage costs for historical data retention. 

With a lack of scalability, legacy SIEM performance becomes another issue. Slow query times consistently lead to poor responsiveness, leaving security teams a step behind potential intruders. The average time to detect a breach is over 200 days. Although this number seems like a security team problem, it’s truly a scale problem: legacy SIEM tools cannot keep up with the overwhelming amount of data produced at cloud scale.

Ultimately, the only solution is a move away from legacy SIEM to a serverless modern SIEM that can provide ease and flexibility for security teams to adapt to the ever-changing AWS attack surface.

Optimizing Ingestion for CloudTrail

Panther ingests, normalizes, and structures your AWS CloudTrail logs as they stream in. Once CloudTrail is logging to an S3 bucket, security teams can connect that bucket to Panther, and logs begin streaming in within a few minutes of completing the integration. Panther’s parsing engine then uses a schema to structure and type the ingested data as JSON, and the data is normalized to a cleaner format. Below you can see the difference between an incoming raw log and a fully parsed log after it is run through Panther’s parsing engine.

Graphic showing a raw HTTP request log
Graphic of the log parsed into a JSON object with standardized fields
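
For readers who want to see what the raw input looks like in code: CloudTrail delivers log files to S3 as gzip-compressed JSON documents with a top-level `Records` array. The sketch below shows only that unpacking step in Python. It is not Panther's parser; the function name `read_cloudtrail_file` and the in-memory stand-in for an S3 object are assumptions made for the example.

```python
import gzip
import json


def read_cloudtrail_file(raw_bytes: bytes) -> list:
    """Decompress one CloudTrail log file and return its list of records."""
    document = json.loads(gzip.decompress(raw_bytes))
    return document.get("Records", [])


# Build a tiny in-memory stand-in for a log file fetched from S3.
fake_file = gzip.compress(json.dumps({
    "Records": [
        {"eventName": "ConsoleLogin", "sourceIPAddress": "198.51.100.7"},
        {"eventName": "CreateUser", "sourceIPAddress": "198.51.100.7"},
    ]
}).encode("utf-8"))

records = read_cloudtrail_file(fake_file)
print([record["eventName"] for record in records])
```

In a real pipeline the bytes would come from the connected bucket rather than from `gzip.compress`, and each record would then be typed against a schema.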

Panther also enriches the incoming data with p_ fields, which can be seen in the parsed log above. These fields mark common indicators of compromise (IoCs) and other helpful attributes such as the timestamp, when a log was created, and the associated log type. The enriched logs can be used with our indicator search feature to run IoC-based investigations, and in detections as well. 
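
The enrichment idea can be sketched in a few lines of Python. This is an illustration of the concept, not Panther's implementation: the field names below are modeled on the p_ prefix described above, and the exact set of fields, the `enrich` function, and the simple string-matching heuristics are assumptions.

```python
import ipaddress
from datetime import datetime, timezone


def is_ip(value: str) -> bool:
    """True when value parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(value)
        return True
    except ValueError:
        return False


def walk_strings(node):
    """Yield every string value nested anywhere inside a dict or list."""
    if isinstance(node, dict):
        for child in node.values():
            yield from walk_strings(child)
    elif isinstance(node, list):
        for child in node:
            yield from walk_strings(child)
    elif isinstance(node, str):
        yield node


def enrich(event: dict, log_type: str = "AWS.CloudTrail") -> dict:
    """Return a copy of event with illustrative p_-style fields added."""
    strings = list(walk_strings(event))
    enriched = dict(event)
    enriched["p_log_type"] = log_type
    enriched["p_event_time"] = event.get("eventTime")
    enriched["p_parse_time"] = datetime.now(timezone.utc).isoformat()
    # Collect indicators from anywhere in the record so they can be searched later.
    enriched["p_any_ip_addresses"] = sorted({s for s in strings if is_ip(s)})
    enriched["p_any_aws_arns"] = sorted({s for s in strings if s.startswith("arn:aws")})
    return enriched


enriched_event = enrich({
    "eventTime": "2023-01-01T00:00:00Z",
    "sourceIPAddress": "203.0.113.10",
    "userIdentity": {"arn": "arn:aws:iam::123456789012:user/example-user"},
    "resources": [{"ARN": "arn:aws:s3:::example-bucket"}],
})
print(enriched_event["p_any_ip_addresses"], enriched_event["p_any_aws_arns"])
```

Gathering indicators into a few well-known fields is what makes an IoC search cheap: an investigator can look for one IP address across every log type without knowing where each source stores it.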

Once logs are ingested, detections are applied to the incoming data. These rules are written in Python and can be used out of the box or customized by a security team. Because the rules run as soon as logs are ingested and structured, an alert can be generated within a few minutes of the log event being received, creating a real-time detection engine and response workflow for security teams. Whether or not an alert is generated, the data is replicated and transported to a security data lake on the backend, where it’s stored for 365 days by default.
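
As a rough illustration of what a Python detection can look like, the sketch below flags successful AWS console logins made without MFA. It follows the common pattern of a `rule` function that returns `True` when an alert should fire, plus a `title` function for the alert text. The events are treated as plain dictionaries and the sample records are invented, so take this as a sketch under those assumptions rather than a production rule.

```python
def rule(event: dict) -> bool:
    """Return True when a console login succeeded without MFA."""
    if event.get("eventName") != "ConsoleLogin":
        return False
    response = event.get("responseElements") or {}
    extra = event.get("additionalEventData") or {}
    return response.get("ConsoleLogin") == "Success" and extra.get("MFAUsed") == "No"


def title(event: dict) -> str:
    """Build a human-readable alert title."""
    arn = (event.get("userIdentity") or {}).get("arn", "<unknown identity>")
    return f"Console login without MFA by {arn}"


# Invented sample events for demonstration.
login_without_mfa = {
    "eventName": "ConsoleLogin",
    "userIdentity": {"arn": "arn:aws:iam::123456789012:user/example-user"},
    "responseElements": {"ConsoleLogin": "Success"},
    "additionalEventData": {"MFAUsed": "No"},
}
login_with_mfa = {
    "eventName": "ConsoleLogin",
    "responseElements": {"ConsoleLogin": "Success"},
    "additionalEventData": {"MFAUsed": "Yes"},
}

for sample in (login_without_mfa, login_with_mfa):
    if rule(sample):
        print(title(sample))
```

Only the first sample triggers the rule. Because the logic is ordinary Python, a team can tighten it (for example, by ignoring a known break-glass account) without learning a vendor-specific query language.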

Diagram showing complex AWS logs being parsed, normalized, analyzed, and simultaneously being run through detections in addition to being stored in the data lake

From start to finish, Panther’s ingestion process is the backbone of the product. It provides immediate structure and enrichment of incoming CloudTrail logs that can be leveraged for detecting, alerting on, and investigating potential threats. Ingesting CloudTrail this way delivers the following benefits: 

Benefits of Modern SIEM 

  1. Serverless Architecture – A modern SIEM like Panther uses a serverless architecture to ingest logs from sources like CloudTrail. Panther is built on AWS Lambda functions that scale up and down automatically based on the needs of the moment. This eliminates the need for a security team to build its own infrastructure and enables a more agile approach to ingestion. 
  2. High Performance and Storage Scalability – With a security data lake on the backend, Panther provides an easy way to store 365 days of CloudTrail logs. By separating compute from storage, Panther can query historical AWS events extremely quickly and allows for correlation between AWS and non-AWS log sources across the technology stack. 
  3. Detection-as-Code – The ability to use developer-centric workflows inside Panther to create, manage, test, and deploy detections cuts the time to create a new rule from weeks to days or even hours. 
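
To illustrate the Detection-as-Code point, here is a small Python sketch in which test cases live next to the rule they exercise, so a change to the rule can be checked before it is deployed. The rule (alerting on API calls by the account root user), the `TEST_CASES` table, and the `run_tests` helper are hypothetical examples, not Panther's testing framework.

```python
def rule(event: dict) -> bool:
    """Alert on any API call made by the AWS account root user."""
    return (event.get("userIdentity") or {}).get("type") == "Root"


# Each case: (description, event, expected rule result).
TEST_CASES = [
    (
        "root user call alerts",
        {"eventName": "CreateAccessKey", "userIdentity": {"type": "Root"}},
        True,
    ),
    (
        "IAM user call is ignored",
        {"eventName": "CreateAccessKey", "userIdentity": {"type": "IAMUser"}},
        False,
    ),
    (
        "missing identity is ignored",
        {"eventName": "DescribeInstances"},
        False,
    ),
]


def run_tests() -> list:
    """Return the descriptions of failing cases; an empty list means all passed."""
    return [name for name, event, expected in TEST_CASES if rule(event) != expected]


failures = run_tests()
print("failures:", failures)
```

Keeping rules and their test cases together in version control is what lets detections go through the same review and release steps as any other code.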

Get Started! Try Panther

Because Panther can be set up in under an hour, security teams can quickly add coverage to their AWS environment with CloudTrail or any other common AWS log source. 

Try Panther for 30 days!