Shifting SIEM Left: Securing the Software Supply Chain with GitHub Monitoring

I have worked with several SIEMs over the years, and when I came to Panther, I noticed something very different about the data sources our customers ingest. It’s not just traditional data sources such as firewalls, Windows audit logs, and IDS/IPS, but also data sources more commonly associated with Cloud and DevOps, such as Okta, GitHub, AWS CloudTrail, Slack, Zoom, and custom application logs. I went back to my notes on the top data sources I saw when working with legacy SIEMs and compared them to the data sources that organizations bring into Panther, and the difference was pretty clear.

Legacy SIEM 
– Firewall 
– Network IPS/IDS
– Endpoint/Windows Audit Logs
– Authentication (AD)
– VPN/Proxy
– Active Directory
– DNS
– DHCP
– DLP

Panther/Cloud SIEM
– AWS/GCP/Azure
– Authentication (Okta)
– GitHub
– Custom App Logs
– GSuite/M365
– Slack
– Password Managers
– Zoom
– Email

This shift has accelerated over the past few years, particularly with the COVID-19 pandemic forcing some companies into a cloud-first approach whether they liked it or not. The data sources will keep evolving as more data and workloads get pushed to the Cloud. Organizations no longer manage only workstations within traditional on-prem networks; they also manage a remote workforce, cloud workloads, and custom applications, all of which increase the data volume and complexity of SIEM deployments.

Threat actors have followed this shift by retooling their techniques to target developers and upstream infrastructure. For example, the Lazarus APT Group, which operates out of North Korea, has been ramping up its capabilities against gaming, cryptocurrency, and cybersecurity companies. In a recent security alert, GitHub did a great job detailing one of Lazarus’ latest campaigns, including IoCs and the specific attack methodology: social engineering aimed at developers to trick them into executing malicious code on their local systems.

Ingesting GitHub logs into your SIEM may seem strange, but given the shift to the Cloud and the threat actors following suit, it makes sense to monitor your source code repositories. For many organizations, their code is their crown jewel, and it can also pose a threat downstream if it is compromised and malicious code is injected.

Ingesting GitHub Logs Into Panther

In Panther, integrating GitHub is quite simple. Exporting logs from GitHub requires a GitHub Enterprise license. Logs flow from GitHub into Panther once you create an API credential with read permissions. You can configure access via the OAuth2 Authorization Flow or a Personal Access Token; in my lab environment, I used a Personal Access Token for simplicity.
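Before wiring a token into any integration, it can help to confirm what a read-only audit log request looks like. The sketch below only builds the request against GitHub's organization audit log REST endpoint and does not send it. The helper name `build_audit_log_request`, the organization name, and the token value are placeholders of mine, not anything Panther requires, and this is not a description of how Panther itself collects the logs.

```python
import urllib.parse
import urllib.request

GITHUB_API = "https://api.github.com"


def build_audit_log_request(org, token, per_page=5):
    """Build (but do not send) a GET request for an organization's audit log."""
    query = urllib.parse.urlencode({"per_page": per_page})
    url = f"{GITHUB_API}/orgs/{urllib.parse.quote(org)}/audit-log?{query}"
    return urllib.request.Request(
        url,
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
        },
    )


# Hypothetical organization and token, for illustration only
req = build_audit_log_request("example-org", "ghp_placeholder")
print(req.get_method(), req.full_url)
```

Sending the request with `urllib.request.urlopen(req)` should return a JSON array of audit events if the token has the required read access; a permissions error at this point is easier to diagnose than a silent, empty log source.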

Once I had GitHub logs flowing into my environment, I enabled the Panther GitHub Pack, which includes around 20 out-of-the-box detections. The entire process of onboarding GitHub data and enabling the pack took less than 15 minutes. It provides real-time detections for critical events and a Security Data Lake holding all log events generated by GitHub, with one year of searchable retention.

Highway to the Danger Zone

Poking around GitHub, I started to explore potential attack vectors. GitHub has conveniently highlighted a few actions in the “Danger Zone” that you will definitely want to be alerted to if triggered. One example is disabling branch protection rules. Branch protection rules are a set of settings and restrictions you can apply to a specific branch in a repository to control who can change that branch and under what conditions. These rules are often used to ensure code quality, maintain a clean and stable main or master branch, and prevent unauthorized or accidental changes.

If branch protection rules are disabled, we will want to trigger a critical severity alert; luckily, this is one of the detections Panther provides out of the box, so most of these actions were already covered once we enabled the pack.
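To make the idea concrete, here is a minimal sketch of what a detection for this could look like, written in the same rule()/severity()/title() style used for Panther detections. It is not the detection that ships in the pack. The action name `protected_branch.destroy` and the `actor` and `repo` fields are my assumptions based on GitHub audit log naming, so confirm them against events in your own environment.

```python
# Assumed audit log action for removing a branch protection rule;
# verify against your own GitHub events before relying on it.
BRANCH_PROTECTION_REMOVED = "protected_branch.destroy"


def rule(event):
    # Match only the branch protection removal action
    return event.get("action") == BRANCH_PROTECTION_REMOVED


def severity(event):
    # Losing branch protection is treated as critical in this sketch
    return "CRITICAL"


def title(event):
    return (
        f"Branch protection removed on [{event.get('repo', '<UNKNOWN_REPO>')}] "
        f"by [{event.get('actor', '<UNKNOWN_ACTOR>')}]"
    )


# Hypothetical event, trimmed to the fields the sketch reads
sample = {"action": "protected_branch.destroy", "actor": "alice", "repo": "acme/web"}
print(rule(sample), severity(sample), title(sample))
```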

In addition to the real-time detections that Panther provides out of the box, Panther makes it easy to write new ones. One thing I wanted to monitor is when people are invited to the organization, when they are added, and when someone is escalated to an admin, with the severity increasing at each step. First, I invited a user, accepted the invite, and then escalated the user’s privileges. This generated logs that I could view in Panther’s Security Data Lake via Search, use to develop a hypothesis for new detections, and copy as log entries for my unit tests.

Looking at the event actions for the log events, I see that the user events start with an “org” prefix followed by the action, so we have org.invite_member, org.add_member, and org.update_member. We want to assign increasing severity to these, so we will use the Panther severity() function. I use a Python dictionary to store the actions and the severity I wish to apply to each. In the required Panther rule() function, I check whether the event’s “action” field matches any of the actions I want to trigger my detection on; then, in the severity() function, I look up the severity from the dictionary. I then output the title with specific information about the user that performed the action and the user account affected by the change.

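# Map each GitHub audit log action to the severity it should raise
gh_actions = {
    "org.invite_member": "LOW",
    "org.add_member": "MEDIUM",
    "org.update_member": "HIGH",
}

def rule(event):
    # Fire only on the membership actions listed above
    return event.get("action") in gh_actions

def severity(event):
    # Look up the severity for the matched action; INFO is a fallback
    return gh_actions.get(event.get("action"), "INFO")

def title(event):
    # 'user' is the affected account; 'actor' is who performed the action
    return (
        f"A [{event.get('action', '<UNKNOWN_ACTION>')}] action affecting "
        f"[{event.get('user', '<UNKNOWN_USER>')}] "
        f"was performed by [{event.get('actor', '<UNKNOWN_ACTOR>')}] "
        f"for org [{event.get('org', '<UNKNOWN_ORG>')}]"
    )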

I then copy the JSON for all three events from my Security Data Lake to use as unit tests for my detection, and I add one more unit test with an unrelated event that I expect not to match.
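The same checks can be sketched locally in plain Python before touching the Panther console. The events below are hypothetical and trimmed to a few fields rather than copied from real logs, the user and org names are invented, and the plain dictionaries stand in for Panther’s event object only to the extent that both support .get().

```python
gh_actions = {
    "org.invite_member": "LOW",
    "org.add_member": "MEDIUM",
    "org.update_member": "HIGH",
}


def rule(event):
    return event.get("action") in gh_actions


def severity(event):
    return gh_actions.get(event.get("action"), "INFO")


# (event, should the rule match?, expected severity when it matches)
test_cases = [
    ({"action": "org.invite_member", "actor": "alice", "user": "bob", "org": "acme"}, True, "LOW"),
    ({"action": "org.add_member", "actor": "alice", "user": "bob", "org": "acme"}, True, "MEDIUM"),
    ({"action": "org.update_member", "actor": "alice", "user": "bob", "org": "acme"}, True, "HIGH"),
    # Negative case: an unrelated action must not trigger the detection
    ({"action": "repo.create", "actor": "alice", "org": "acme"}, False, None),
]

for event, should_match, expected_severity in test_cases:
    assert rule(event) is should_match, event["action"]
    if should_match:
        assert severity(event) == expected_severity, event["action"]

print("all unit tests passed")
```

Keeping the negative case alongside the positive ones guards against a rule that accidentally matches every GitHub event.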

When I run all of my unit tests, I can see that they all behave as expected: each of the three events matches with the appropriate, increasing severity, and my negative unit test does not match.

Conclusion

Although GitHub is not a traditional data source for SIEM, we outlined why modern organizations should consider ingesting these logs to provide visibility and monitor their development environments. Panther can quickly ingest this data with GitHub Enterprise and apply out-of-the-box detections in minutes; then, we can use our Security Data Lake with a full year of searchable data to develop hypotheses for new detections, as well as use the log events as unit tests for us to test our new detections using real data. If you want to learn more about how to write detections in Panther and detection-as-code concepts, we have several hands-on workshops available where we will get you up and running with Panther in just a few hours. I hope to see you in one soon! 
