Threat Hunting in AWS

Calvin Kim

In recent years, threat hunting has become an essential component of infosec programs. Most threat hunting programs operate under the premise that the network is already compromised, so the threat hunter's job is to investigate suspicious behavior and remove the bad actor from the environment.

This blog will show you how to threat hunt in AWS log sources such as CloudTrail and VPC Flow Logs, but you can apply the same principles to any log source. Before you get started, you need to send the logs of interest to your Security Information and Event Management (SIEM) platform.

In this case, we will send AWS logs to Panther. Panther’s security data lake enables security teams to run faster queries across large data sets and investigate and hunt for anomalous activity within minutes. While these queries can be manually run at any time, deploying them as scheduled queries will allow you to search for suspicious activity in your environment continuously. 

Let’s review some examples of hunting through AWS CloudTrail and VPC Flow logs, which monitor your environment’s API activity and network traffic.

Example 1: Finding Compromised Credentials (TA0001:T1078.004)

AWS uses the concept of roles for authorization. These roles are “assumed” by entities such as instances or users, which are then granted temporary access keys. If these keys are compromised, an attacker can access the AWS environment using the permissions of the compromised role. One way to identify compromised credentials is to look for roles that are assumed by unusual user agents. For example, if your CI/CD pipeline uses the Java SDK to assume a deployment role, it would be unusual to see an AssumeRole event for that role with a user agent from the boto3 Python SDK. This query can help identify those anomalies by listing role and principal pairs seen with more than one distinct user agent.

SELECT
  requestParameters:roleArn,
  userIdentity:principalId,
  count(DISTINCT userAgent)
FROM panther_logs.public.aws_cloudtrail
-- AssumeRole is an AWS STS API call
WHERE eventSource = 'sts.amazonaws.com'
  and eventName = 'AssumeRole'
  and p_occurs_since('2 days')
  and userIdentity:principalId != 'null'
  and userAgent != 'AWS Internal'
  and requestParameters:roleArn != 'null'
GROUP BY requestParameters:roleArn, userIdentity:principalId
HAVING count(DISTINCT userAgent) > 1
ORDER BY count(DISTINCT userAgent) DESC

While there may be cases where this is legitimate behavior in your environment, the roles discovered by this query would warrant further investigation.
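
The grouping logic behind this hunt can be sanity-checked offline before scheduling it. The following is a minimal Python sketch, not Panther's implementation: it assumes CloudTrail records have already been parsed into dictionaries, that AssumeRole events carry the event source `sts.amazonaws.com`, and it skips the time-window filter. The function name and the sample records are hypothetical.

```python
from collections import defaultdict


def roles_with_multiple_user_agents(events):
    """Map (roleArn, principalId) to its sorted user agents when more than one was seen."""
    agents = defaultdict(set)
    for event in events:
        # Keep only STS AssumeRole calls (assumed event source for AssumeRole).
        if event.get("eventSource") != "sts.amazonaws.com":
            continue
        if event.get("eventName") != "AssumeRole":
            continue
        user_agent = event.get("userAgent")
        role_arn = (event.get("requestParameters") or {}).get("roleArn")
        principal_id = (event.get("userIdentity") or {}).get("principalId")
        # Skip records with missing fields and AWS-internal callers.
        if not role_arn or not principal_id or not user_agent:
            continue
        if user_agent == "AWS Internal":
            continue
        agents[(role_arn, principal_id)].add(user_agent)
    # Equivalent of HAVING count(DISTINCT userAgent) > 1.
    return {key: sorted(seen) for key, seen in agents.items() if len(seen) > 1}


def make_event(role_arn, principal_id, user_agent):
    """Build a hypothetical, heavily trimmed CloudTrail AssumeRole record."""
    return {
        "eventSource": "sts.amazonaws.com",
        "eventName": "AssumeRole",
        "userAgent": user_agent,
        "requestParameters": {"roleArn": role_arn},
        "userIdentity": {"principalId": principal_id},
    }


DEPLOY_ROLE = "arn:aws:iam::123456789012:role/deploy"
AUDIT_ROLE = "arn:aws:iam::123456789012:role/audit"

sample_events = [
    make_event(DEPLOY_ROLE, "AROAEXAMPLE1:ci", "aws-sdk-java/1.12.0"),
    make_event(DEPLOY_ROLE, "AROAEXAMPLE1:ci", "Boto3/1.26.0"),
    make_event(AUDIT_ROLE, "AROAEXAMPLE2:audit", "aws-cli/2.9.0"),
    make_event(AUDIT_ROLE, "AROAEXAMPLE2:audit", "AWS Internal"),
]

suspicious = roles_with_multiple_user_agents(sample_events)
for (role_arn, principal_id), seen in suspicious.items():
    print(role_arn, principal_id, seen)
```

Only the deployment role is reported here, because the audit role's second user agent is the excluded AWS-internal caller.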

Example 2: Cloud Service Surveillance (TA0007:T1526)

An attacker may perform surveillance to learn which services a set of compromised credentials can access. Failed attempts to access a service result in AccessDenied errors in CloudTrail. The following query can help identify identities that received AccessDenied errors on more than five distinct API calls (event names) in the past day, which may indicate failed attempts to use a role beyond its intended scope.

SELECT
  userIdentity:arn,
  count(DISTINCT eventName)
FROM panther_logs.public.aws_cloudtrail
WHERE errorCode = 'AccessDenied'
  and p_occurs_since('1 day')
GROUP BY userIdentity:arn
HAVING count(DISTINCT eventName) > 5
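
To make the threshold logic concrete, here is a minimal Python sketch of the same aggregation over records that have already been parsed into dictionaries. It is an illustration under stated assumptions rather than part of Panther: the function name, the `threshold` parameter, and the sample records are hypothetical, the time-window filter is omitted, and records without an identity ARN are skipped instead of being grouped together.

```python
from collections import defaultdict


def principals_with_many_denials(events, threshold=5):
    """Count distinct denied API calls per identity ARN and keep those above threshold."""
    denied = defaultdict(set)
    for event in events:
        if event.get("errorCode") != "AccessDenied":
            continue
        arn = (event.get("userIdentity") or {}).get("arn")
        event_name = event.get("eventName")
        if arn is None or event_name is None:
            continue
        denied[arn].add(event_name)
    # Equivalent of HAVING count(DISTINCT eventName) > threshold.
    return {arn: len(names) for arn, names in denied.items() if len(names) > threshold}


PROBING_ARN = "arn:aws:sts::123456789012:assumed-role/web/i-0abc"
NORMAL_ARN = "arn:aws:sts::123456789012:assumed-role/batch/i-0def"

probed_calls = [
    "ListBuckets", "ListUsers", "DescribeInstances",
    "ListSecrets", "ListFunctions", "DescribeDBInstances",
]

sample_events = [
    {"errorCode": "AccessDenied", "eventName": name, "userIdentity": {"arn": PROBING_ARN}}
    for name in probed_calls
]
# The same denied call repeated many times still counts as one distinct event name.
sample_events += [
    {"errorCode": "AccessDenied", "eventName": "GetObject", "userIdentity": {"arn": NORMAL_ARN}}
    for _ in range(20)
]

flagged = principals_with_many_denials(sample_events)
print(flagged)
```

Counting distinct event names rather than raw errors is the design choice that matters: a misconfigured job retrying one denied call is not flagged, while an identity probing many different APIs is.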

Example 3: Port Scanning Activity (TA0007:T1046)

Network logs can be hard to aggregate because of their sheer volume. Scheduled queries make network analysis more manageable and efficient. After gaining access to your network, an attacker may scan for exposed ports and services. The following query ranks source and destination pairs by the number of distinct destination ports contacted, which surfaces cases where a single source communicates with many ports on one destination IP. While this can also be legitimate activity in your network, it may help identify an entity engaging in port scanning.

SELECT
  srcAddr, dstAddr, vpcId, region, subNetId,
  COUNT(DISTINCT dstPort)
FROM panther_logs.public.aws_vpcflow
WHERE p_occurs_since('1 hour')
  and srcAddr != 'null'
  and srcPort not in (443, 80, 2049, 123, 445)
  and dstPort not in (443, 80, 2049, 123, 445)
  and flowDirection = 'egress'
GROUP BY srcAddr, dstAddr, vpcId, region, subNetId
ORDER BY COUNT(DISTINCT dstPort) DESC
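
The ranking idea can also be tried on a handful of flow records outside the data lake. This Python sketch is a simplification under stated assumptions: flows are dictionaries with the same field names as the query, grouping uses only the source and destination address (the query also groups by VPC, region, and subnet), and the time-window filter is omitted. The function name and the sample flows are hypothetical.

```python
from collections import defaultdict

# Ports excluded by the query as common, expected traffic.
COMMON_PORTS = {443, 80, 2049, 123, 445}


def rank_port_fanout(flows):
    """Return (srcAddr, dstAddr, distinct destination port count), highest count first."""
    ports = defaultdict(set)
    for flow in flows:
        if flow.get("flowDirection") != "egress":
            continue
        if flow.get("srcAddr") is None:
            continue
        if flow.get("srcPort") in COMMON_PORTS or flow.get("dstPort") in COMMON_PORTS:
            continue
        ports[(flow["srcAddr"], flow["dstAddr"])].add(flow["dstPort"])
    ranked = [(src, dst, len(seen)) for (src, dst), seen in ports.items()]
    return sorted(ranked, key=lambda row: row[2], reverse=True)


def make_flow(src, dst, dst_port, src_port=50000, direction="egress"):
    """Build a hypothetical, heavily trimmed VPC flow record."""
    return {
        "srcAddr": src,
        "dstAddr": dst,
        "srcPort": src_port,
        "dstPort": dst_port,
        "flowDirection": direction,
    }


sample_flows = (
    # One host touching five different ports on the same destination.
    [make_flow("10.0.0.5", "10.0.0.9", port) for port in (21, 22, 23, 25, 3389)]
    # A database client reusing a single port.
    + [make_flow("10.0.0.7", "10.0.0.9", 5432) for _ in range(3)]
    # Excluded: common port and ingress traffic.
    + [make_flow("10.0.0.8", "10.0.0.9", 443)]
    + [make_flow("10.0.0.8", "10.0.0.9", 8080, direction="ingress")]
)

ranked = rank_port_fanout(sample_flows)
for row in ranked:
    print(row)
```

A pair at the top of this list with a high distinct port count is a candidate for closer review, not proof of scanning.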


We hope these queries give you a starting point to dig deeper into your data and inspire you to create your own queries. If you have a query you’d like to share, please submit a pull request to the panther-analysis repository.