A Technical Primer in Detection Engineering

Tools that an organization can use to detect threats are no longer a nice-to-have. Businesses are moving to the cloud, and the threat landscape is evolving and increasing in complexity. Today, threat detection is mission-critical. 

But manual threat detection processes can no longer keep up, and security teams must consistently address the challenges that threaten business objectives. The only way to overcome these hurdles and achieve threat detection at scale is to treat detections like software, an approach known as detection-as-code.

Bad actors and adversaries are actively engineering their attacks and techniques to evolve, so it only makes sense for security teams to adapt. 

That’s where detection engineering comes into play. 

What is detection engineering? 

At its core, detection engineering functions within security operations and deals with the design, development, testing, and maintenance of threat detection logic. Threat detection logic is any rule, query, or tool used to detect activity that is either malicious, unexpected, or increases the risk that malicious activity will occur.

However, for an organization to truly embrace detection engineering, it must build a culture around the practice and go all-in on its threat detection process. This means having buy-in from all stakeholders: the security team, content developers, risk management, and threat hunters.

We’ve established that people and process are a critical part of detection engineering, but the threat detection platform that an organization chooses is equally crucial. When you leverage Detection-as-Code to build high-signal alerts, detection engineering shines. 

The benefits of hardening and testing detections

Security teams often include detection engineers responsible for creating, testing, and tuning detections that alert the team to malicious activity while minimizing false positives. Testing and hardening your detections helps you optimize your detection rules, which reduces alert fatigue and lets alerts arrive with context and clarity.

When threat detection programs are fine-tuned to your specific environment and systems, your security team benefits. But that’s only achievable by treating detections as well-written code that can be tested, checked into source control, and code-reviewed by peers. This way, teams can produce higher-quality alerts that reduce fatigue and quickly flag suspicious activity.

1. Reduce alert fatigue

Nothing bogs down a security team like too many alerts, especially when they’re false positives. 

Scenarios like the following happen far too frequently. You’re scrolling through alerts, and each one looks no different from the last: another false positive, so you keep scrolling. Eventually, you get frustrated and start skimming through alerts more quickly than usual. But what if you miss the one alert that looks like a false positive but is actually a real threat?

Alert fatigue is a serious symptom; hardening and testing your detections is an effective cure. 

Writing, testing, and maintaining complex detections doesn’t need to be challenging and inefficient. When your security team can use a general-purpose programming language like Python, you can write more sophisticated detections tailored to the specific needs of your enterprise. These rules also tend to stay readable and easy to understand as complexity increases.

Hardening and testing enable your security team to focus only on legitimate alerts. This way, your security team can focus on critical issues like remediation, especially when every second counts. 
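
As a minimal sketch of what hardening can look like in Python, the rule below suppresses alerts from known-good sources before applying its detection logic. The event fields (`remoteAddr`, `status`, `request`), the `TRUSTED_NETWORKS` allowlist, and its address range are all hypothetical and would need to match your own logs and environment.

```python
import ipaddress

# Hypothetical allowlist of internal networks; adjust for your environment.
TRUSTED_NETWORKS = [ipaddress.ip_network("10.0.0.0/8")]


def is_trusted(addr):
    """Return True if addr parses as an IP address inside a trusted network."""
    try:
        ip = ipaddress.ip_address(addr)
    except ValueError:
        return False  # Missing or malformed addresses are never trusted.
    return any(ip in net for net in TRUSTED_NETWORKS)


def rule(event):
    """Alert on successful admin requests that do not come from trusted sources."""
    if is_trusted(event.get("remoteAddr") or ""):
        return False
    request = event.get("request") or ""
    return event.get("status") == 200 and "admin-panel" in request
```

Keeping the allowlist in one named constant makes the tuning decision visible in code review, which is harder to achieve when exceptions are scattered across ad hoc filters.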

2. Provide the proper alert context 

In addition to reducing alert fatigue, sound detection engineering can go a step further by providing context about the attack. With the right context, you’ll know who is attacking you, why they’re attacking, what they are capable of, and which of your systems or assets are vulnerable to compromise.

But gaining context is impossible without testing and hardening your detections. The more time you devote to evaluating the efficacy of those detections, the better your chances of the right alerts getting to the right person at the right time. 

Ultimately, your alerts will be actionable, timely, rich in context, and understandable by decision-makers.
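
One way to attach context is to have a detection return a small dictionary of the fields a responder will want first. The sketch below assumes a helper named `alert_context` and NGINX-style field names (`remoteAddr`, `httpUserAgent`, `request`, `status`). Treat both as illustrative, not as a definitive interface.

```python
def alert_context(event):
    """Collect the fields a responder is likely to need first."""
    # The request line looks like "GET /path HTTP/1.1"; tolerate missing values.
    parts = (event.get("request") or "").split()
    return {
        "source_ip": event.get("remoteAddr"),
        "user_agent": event.get("httpUserAgent"),
        "method": parts[0] if len(parts) > 0 else None,
        "path": parts[1] if len(parts) > 1 else None,
        "status": event.get("status"),
    }
```

Putting these fields into the alert itself saves the recipient from opening the raw log to answer basic questions about the request.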

3. The ability to leverage Test-Driven Development (TDD)

Test-driven development (TDD) is a software development process where test cases are written before the code that will make those tests pass. The goal of TDD is to create self-testing code that can be executed quickly to identify and fix problems. This process helps prevent errors, improves productivity, and reduces the amount of rework required after changes are made to a project.

By leveraging TDD, security teams can discover detection blind spots early in the process, test for false alerts, and promote detection efficacy. Security teams that incorporate a TDD approach to detections put themselves in the mind of an attacker and document their line of thinking so that they have a list of insights into an attacker’s TTPs (tactics, techniques, and procedures).

Writing detections with TDD in mind improves the quality of detection code, resulting in more modular, extensible, and flexible detections — all without fear of breaking alerts or disrupting everyday operations.
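
To make the test-first order concrete, here is a small sketch. The expectations are written first, then just enough rule logic to satisfy them. The scenario (a root console login without MFA), the event fields, and the `failing_cases` helper are all hypothetical.

```python
# Step 1: write the expectations first. Each case pairs a made-up sample
# event with the result the detection should return for it.
TEST_CASES = [
    ("root login without MFA", {"action": "ConsoleLogin", "user": "root", "mfa_used": False}, True),
    ("root login with MFA", {"action": "ConsoleLogin", "user": "root", "mfa_used": True}, False),
    ("non-root login without MFA", {"action": "ConsoleLogin", "user": "alice", "mfa_used": False}, False),
    ("malformed event", {}, False),
]


# Step 2: write just enough detection logic to make the cases pass.
def rule(event):
    return (
        event.get("action") == "ConsoleLogin"
        and event.get("user") == "root"
        and not event.get("mfa_used", False)
    )


def failing_cases(rule_fn, cases):
    """Return the names of cases where rule_fn disagrees with the expectation."""
    return [name for name, event, expected in cases if bool(rule_fn(event)) != expected]
```

Calling `failing_cases(rule, TEST_CASES)` returns an empty list once the rule satisfies every expectation. Adding a new case before changing the rule keeps the blind spot documented even if the fix comes later.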

4. Leveraging detection-as-code and CI/CD pipelines

Detection-as-Code (DaC) is a modern, flexible, and structured approach to writing detections that apply software engineering best practices to detection engineering. By adopting DaC, teams can build scalable processes for writing and hardening detections to identify sophisticated threats across rapidly expanding environments.

A Continuous Integration/Continuous Deployment (CI/CD) pipeline can be a key driver for security teams wanting to shift security left. When you use a CI/CD pipeline, you can easily enforce testing and linting checks. Plus, you’ll always have the most up-to-date version of the detection logic running in production.

CI/CD enables automated testing and delivery pipelines for your security detections. Instead of manually testing, deploying, and ensuring that the detections aren’t overly tuned – which could trigger false alerts – teams can stay agile by focusing on building fine-tuned detections.
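
As a rough illustration of the kind of gate a pipeline might enforce, the sketch below checks that every detection has test cases, that no rule crashes, and that results match expectations. The `DETECTIONS` registry, the helper names, and the sample rule are assumptions made for this example. A real pipeline would typically discover detections from the repository instead.

```python
def rule_admin_panel(event):
    request = event.get("request") or ""
    return event.get("status") == 200 and "admin-panel" in request


# Hypothetical registry: detection name -> (rule function, test cases).
# Each test case is a (sample event, expected result) pair.
DETECTIONS = {
    "admin_panel_access": (
        rule_admin_panel,
        [
            ({"status": 200, "request": "GET /admin-panel/ HTTP/1.1"}, True),
            ({"status": 404, "request": "GET /admin-panel/ HTTP/1.1"}, False),
        ],
    ),
}


def check_detections(detections):
    """Return a list of problems; an empty list means the gate passes."""
    problems = []
    for name, (rule_fn, cases) in detections.items():
        if not cases:
            problems.append(f"{name}: no test cases")  # lint-style check
        for index, (event, expected) in enumerate(cases):
            try:
                actual = bool(rule_fn(event))
            except Exception as exc:  # a crashing rule should fail the build
                problems.append(f"{name}[{index}]: raised {exc!r}")
                continue
            if actual != expected:
                problems.append(f"{name}[{index}]: expected {expected}, got {actual}")
    return problems


def exit_code(problems):
    """Map the problem list to a process exit code for the CI runner."""
    return 1 if problems else 0
```

A CI step could print each problem and then call `sys.exit(exit_code(problems))`, so that a failing or untested detection blocks the merge.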

The importance of Detection-as-Code  

Much like good application code, detections should also be treated as well-written code that can be tested, checked into source control, and code-reviewed by peers. In doing so, teams can produce higher-quality alerts that reduce fatigue and quickly flag suspicious activity.

Because every environment is unique, detections require a diverse set of techniques. Detection engineers therefore need custom-tailored rules that can be tested, versioned, and managed programmatically. An expressive programming language like Python gives security teams the flexibility and robustness to detect both advanced and straightforward behaviors, and to fetch context, enrich events, and tell the whole story of what happened.

And with a threat detection platform like Panther, your security team can leverage highly customizable Python-based detections, a built-in testing framework, and the ability to create detections directly in the UI or with a CLI-based workflow.

Looking at an example detection with Panther

With Panther, you can take detection engineering to a new level by crafting high-fidelity detections in Python using CI/CD workflows to power efficient alerting and response.

Panther offers reliable and resilient detections that empower you to:

  • Write expressive and flexible detections in Python for needs specific to your enterprise.
  • Structure and normalize logs into a strict schema that enables detections with Python and queries with SQL.
  • Perform real-time threat detection and power investigations against massive volumes of security data.
  • Benefit from 200+ pre-built detections mapped to specific threats, suspicious activity, and security frameworks like MITRE ATT&CK.

An Example Detection in Panther

First, a bit about rules and testing in Panther. Rules can be enabled and tested directly in the Panther UI, or modified and uploaded programmatically with the Panther Analysis tool, which enables you to test, package, and deploy detections via the command-line interface (CLI). And to assist with incident triage, Panther rules contain metadata such as severity, log types, unit tests, runbooks, and more.

For example, let’s say you want to create a rule which sends an alert when an admin panel is accessed on a web server. 

We’ll use the following NGINX log:

{
  "httpReferer": "https://domain1.com/?p=1",
  "httpUserAgent": "Chrome/80.0.3987.132 Safari/537.36",
  "remoteAddr": "180.76.15.143",
  "request": "GET /admin-panel/ HTTP/1.1",
  "status": 200,
  "time": "2019-02-06 00:00:38 +0000 UTC"
}

A basic rule in Panther defines three functions: rule, title, and dedup.

In our example, we will create: 

  1. A rule function that looks for 200 (OK) web requests to any URL with the admin-panel string.
    Return type: Boolean.
  2. A title to indicate that admin panel logins have been detected from a specific IP address.
    Return type: String.
  3. A dedup function to group all events by the same IP address.
    Return type: String.
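
def rule(event):
    # Match successful (200) requests whose request line mentions the admin panel.
    # Default to '' so a missing request field does not raise a TypeError.
    return event.get('status') == 200 and 'admin-panel' in event.get('request', '')


def title(event):
    # Human-readable alert title that includes the source IP.
    return f"Successful admin panel login detected from {event.get('remoteAddr')}"


def dedup(event):
    # Group events from the same source IP into a single alert.
    return event.get('remoteAddr')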
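
To check the rule against the sample log outside of any platform, something like the following works. It repeats the three functions so that it runs on its own. The defensive default for `request` is an addition made here, and the plain `assert` statements are a stand-in, not Panther’s built-in testing framework.

```python
SAMPLE_EVENT = {
    "httpReferer": "https://domain1.com/?p=1",
    "httpUserAgent": "Chrome/80.0.3987.132 Safari/537.36",
    "remoteAddr": "180.76.15.143",
    "request": "GET /admin-panel/ HTTP/1.1",
    "status": 200,
    "time": "2019-02-06 00:00:38 +0000 UTC",
}


def rule(event):
    return event.get("status") == 200 and "admin-panel" in event.get("request", "")


def title(event):
    return f"Successful admin panel login detected from {event.get('remoteAddr')}"


def dedup(event):
    return event.get("remoteAddr")


# Positive case: the sample log should trigger the rule.
assert rule(SAMPLE_EVENT)
assert title(SAMPLE_EVENT) == "Successful admin panel login detected from 180.76.15.143"
assert dedup(SAMPLE_EVENT) == "180.76.15.143"

# Negative cases: a denied request and an unrelated page should not alert.
assert not rule(dict(SAMPLE_EVENT, status=403))
assert not rule(dict(SAMPLE_EVENT, request="GET /index.html HTTP/1.1"))
```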

When the alert is triggered, here’s what will happen:

  • An alert would be generated and sent to the set of associated destinations, which by default are based on the rule severity.
  • The alert would say, “Successful admin panel login detected from 180.76.15.143”.
  • Similar events with the same dedup string of 180.76.15.143 would be appended to the alert.
  • The recipient of the alert could then check Panther to view all alert metadata, see a summary of the events, and run SQL over all of the events to perform additional analysis.
  • A unique alert will be generated for each unique deduplication string, which, in this case, is the IP of the requestor. 
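
The grouping behavior described in the list above can be approximated in a few lines. This is a simplified model and not Panther’s implementation: `group_into_alerts` is a hypothetical helper, the second and third events are made up, and the sketch ignores any time window or threshold a real platform may apply.

```python
from collections import defaultdict


def dedup(event):
    return event.get("remoteAddr")


def group_into_alerts(events):
    """Group matching events by dedup string: one alert per unique string."""
    alerts = defaultdict(list)
    for event in events:
        alerts[dedup(event)].append(event)
    return dict(alerts)


events = [
    {"remoteAddr": "180.76.15.143", "request": "GET /admin-panel/ HTTP/1.1"},
    {"remoteAddr": "180.76.15.143", "request": "GET /admin-panel/users HTTP/1.1"},
    {"remoteAddr": "203.0.113.7", "request": "GET /admin-panel/ HTTP/1.1"},
]

# Two unique source IPs yield two alerts; the first alert holds two events.
alerts = group_into_alerts(events)
```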

Learn more about how Panther enables testing here. You can also request a demo to see Panther in action and find out how our threat detection platform promotes robust detection engineering.
