5 Reasons Your Pipeline Is Broken – And How to Fix It

TL;DR

Welcome to the 3rd and final part of our series on managing pipelines and infrastructure for high-fidelity detections. Part 1 outlined strategic considerations, and Part 2 drilled into detailed functional requirements. Here we wrap up with common pitfalls and how to avoid them. We also recommend checking out this on-demand webinar to see the full series in action.

Learning from Common Pipeline Mistakes

So you’ve reviewed strategic considerations for your project in Part 1 and specific pipeline functionality and processes in Part 2. Or maybe you’re the type who skips right to the ending… Either way, learning all the ways pipelines can fail will help keep your project on the rails.

Whether you’re just beginning to implement your strategy, or you’re mid-implementation and starting to experience challenges, let’s review how to avoid common pitfalls so your project stays on track. 

#1 – You’re Ingesting Too Much Data

False positives, alert storms, team fatigue, burnout… sound familiar? If you haven’t experienced this directly, you’ve probably heard stories from the trenches. These issues creep in when you’re not being choosy enough about the data you’re ingesting. 

Data quality is a key prerequisite for high-fidelity detections and alerts, and it’s a lot easier to ensure quality when the data set is focused on high-value logs. You could have flawless detection rules with elegant logic perfectly aligned to your use cases, plus an army of expert threat hunters investigating your data. None of it will matter if your data set is so large that detections and search queries take forever.

How to Fix It

So how do you address the needle in a haystack problem? Collect less hay. 

In practical terms, this could mean simply not collecting logs generated by non-critical systems that have little to no security value. Or it could mean parsing high-volume logs for noisy components, filtering those out, and preserving the high-value components. You might focus on collecting only write and data-transfer events (and filtering out the read events), rejected network traffic (and dropping expected, routine traffic), and logs from production environments (and filtering out test and development).
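To make that concrete, here’s a minimal sketch of what event-level filtering might look like in Python. The field names, actions, and drop rules are purely illustrative; yours would need to match your actual log schema and threat model.

```python
# Minimal sketch of event-level filtering before ingestion.
# Field names and rules are illustrative -- adapt them to your own log schema.

NOISY_READ_ACTIONS = {"GetObject", "ListBucket", "DescribeInstances"}

def should_ingest(event: dict) -> bool:
    """Return True if the event is worth sending downstream."""
    # Drop logs from non-production environments.
    if event.get("environment") in {"test", "dev"}:
        return False
    # For network flows, keep rejected traffic and drop expected traffic.
    if event.get("type") == "network_flow":
        return event.get("action") == "REJECT"
    # Drop read-only noise; keep write and data-transfer events.
    if event.get("action") in NOISY_READ_ACTIONS:
        return False
    return True

# Quick check against a few sample events.
events = [
    {"type": "api_call", "action": "PutObject", "environment": "prod"},
    {"type": "api_call", "action": "GetObject", "environment": "prod"},
    {"type": "network_flow", "action": "ACCEPT", "environment": "prod"},
]
print([e for e in events if should_ingest(e)])  # only the PutObject event survives
```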

Don’t go overboard on the filtering though, or you’ll run into another pitfall…  

#2 – You’re Not Ingesting Enough Data

The flip side of the above is limiting your visibility into potential threats by not ingesting enough. If your team has PTSD from alert storms creating an onslaught of false positives, bringing in very high-volume logs like AWS CloudTrail, VPC Flow Logs, or GuardDuty may give you pause. But these logs contain critical information that helps pinpoint high-severity attacks.

It’s not just about ingesting enough data; it’s about ingesting enough of the right data.

The Goldilocks scenario of having just the right volumes from your most valuable logs can be achieved by designing pipeline functionality to support your use cases.

How to Fix It

So how do you get to this Goldilocks scenario? If you read Part 2, you already know. If you skipped ahead, well… you really should read it. Lucky for you, the relevant parts are right here:

  • Review and refine your threat models. Understand which attacks you’re most susceptible to.
  • Prioritize your logs, with careful attention to their formats and volumes (see the sketch after this list).
  • Implement pipeline functionality to filter noisy components of high-volume, high-value logs. You don’t want to open up the floodgates, or you’re back to #1.
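One lightweight way to capture those priorities is an explicit policy map your pipeline consults at ingest time. This is only a sketch; the source names, value tiers, and flags below are invented for illustration.

```python
# Hypothetical ingestion policy keyed by log source: whether to collect it,
# its value tier, and whether noisy components get filtered out first.
# Source names and settings are illustrative only.
INGESTION_POLICY = {
    "aws_cloudtrail": {"collect": True,  "value": "high", "prefilter": True},
    "vpc_flow_logs":  {"collect": True,  "value": "high", "prefilter": True},
    "guardduty":      {"collect": True,  "value": "high", "prefilter": False},
    "app_debug_logs": {"collect": False, "value": "low",  "prefilter": False},
}

def plan_for(source: str) -> dict:
    """Return the handling plan for a source; unknown sources default to not collected."""
    return INGESTION_POLICY.get(source, {"collect": False})

print(plan_for("vpc_flow_logs"))  # collected, high value, filtered first
print(plan_for("printer_logs"))   # not collected by default
```

Keeping this policy in version control alongside your detections makes it easy to review when threat models change.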

#3 – You’re Skipping Testing and QA

If you’re not validating that your security pipelines work as expected before they’re in production, you’re risking failures that could limit visibility and severely weaken your security posture. Data getting routed to the wrong destination, potentially blowing up your budget… Transformation errors causing your detections to fail… Filters breaking down and flooding the system with noisy logs and alert storms… Not ideal.

How to Fix It

Testing and QA are the security engineering equivalents of never skipping leg day. They’re not fun or sexy, but they pay consistent dividends throughout your entire SecOps program.

So get that squat rack ready: stand up a non-production environment to verify that your pipeline functionality works as designed. Send a variety of sample logs through and validate that your routing, filters, and transformations are functioning properly. See how they impact your detection efficacy and alert fidelity. When something breaks, conduct root cause analyses and fix the underlying issues.

There’s detection-as-code, infrastructure-as-code… why not ingestion-as-code too? Treat your pipelines like high-performance software, and carry that over to all other aspects of your SecOps program while you’re at it. 
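If you take ingestion-as-code seriously, pipeline rules become testable units. Here’s a minimal, hypothetical example using Python’s built-in unittest module to validate a toy filter rule against sample logs before it ever touches production data.

```python
import unittest

def keep_event(event: dict) -> bool:
    """Toy filter rule under test: keep production events that aren't read-only noise."""
    if event.get("environment") != "prod":
        return False
    return event.get("action") != "read"

class FilterRuleTests(unittest.TestCase):
    def test_prod_write_events_are_kept(self):
        self.assertTrue(keep_event({"environment": "prod", "action": "write"}))

    def test_read_noise_is_dropped(self):
        self.assertFalse(keep_event({"environment": "prod", "action": "read"}))

    def test_non_prod_events_are_dropped(self):
        self.assertFalse(keep_event({"environment": "dev", "action": "write"}))

if __name__ == "__main__":
    unittest.main()
```

The same pattern scales up: run representative sample logs through your routing and transformation steps in a staging environment, and wire the suite into CI so a broken filter never reaches production.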

#4 – You’re Wasting Compute and Storage

If you’re using the same security infrastructure across all logs, regardless of their relative value, you’re likely spending far more on compute and storage than you actually need. Without prioritizing logs and routing different categories to separate use cases, you end up paying for computing power and storage you never fully use.

Luckily, this pitfall is becoming less prevalent as teams adopt more flexible strategies. Still, it lingers as a symptom of the monolithic SIEM philosophy that encourages kitchen-sink log ingestion and one-size-fits-all approaches to compute and storage.

How to Fix It 

First step: break down your tech stack into modular components. Then decouple the processes and supporting technical functionality required for data collection, analytical processing, and storage. 

Revisit your threat models and log priorities and categorize your logs into high-, moderate-, and low-value buckets. These categories align with real-time analysis in hot storage, near-term correlation analysis in warm storage, and historical analysis and compliance reporting in cold storage. Operationalize these three workflows using a combination of routing, filtering, and transformation processes based on your log formats and detection logic.
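As a rough sketch, routing by value bucket can be as simple as a lookup from bucket to storage tier. The bucket names and tier labels below are placeholders for whatever your own prioritization and storage architecture use.

```python
# Hypothetical mapping from log value bucket to storage tier and its use case.
TIER_BY_VALUE = {
    "high":     "hot",   # real-time detections and alerting
    "moderate": "warm",  # near-term correlation and hunting
    "low":      "cold",  # historical analysis and compliance reporting
}

def route(event: dict) -> str:
    """Pick a storage tier for an event based on its pre-assigned value bucket."""
    return TIER_BY_VALUE.get(event.get("value", "low"), "cold")

print(route({"source": "guardduty", "value": "high"}))       # -> hot
print(route({"source": "legacy_app_logs", "value": "low"}))  # -> cold
```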

#5 – You’re Not Taking Data Refinement Seriously 

So you don’t skip leg day anymore, great. But after a few months you might wonder, are you mixing it up enough to challenge yourself? Not just the standard back squats and deadlifts, but front, box, split, and overhead squats, plus sumo, stiff leg, and trap bar deadlifts? 

Think of all these variations like more advanced data transformations and refinement techniques. They could be the missing piece that tightly aligns your pipeline to your detection strategy and takes your SecOps strategy from good to great.

How to Fix It

Revisit your priority logs (yes, again), and this time take a good, hard look at their native formats. Then flip over to your threat model and detection rules. Now you have the context to think through all the ways you can transform those native logs into standardized formats optimized for your detections. There are hundreds, maybe thousands, of ways transformations can be combined with different log sources to deliver maximum-fidelity detections and alerts.
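For instance, a raw flow-style record might be reshaped into a standardized event your detection rules expect. The raw and normalized field names below are hypothetical; the point is that the transformation is explicit, repeatable code rather than ad hoc parsing.

```python
from datetime import datetime, timezone

def normalize_flow_record(raw: dict) -> dict:
    """Reshape a raw (hypothetical) flow log record into a standardized event
    that detection rules can rely on."""
    return {
        "timestamp": datetime.fromtimestamp(int(raw["start"]), tz=timezone.utc).isoformat(),
        "src_ip": raw["srcaddr"],
        "dst_ip": raw["dstaddr"],
        "dst_port": int(raw["dstport"]),
        "action": raw["action"].lower(),  # e.g. "accept" / "reject"
        "bytes": int(raw["bytes"]),
        "source": "vpc_flow",
    }

raw = {"start": "1700000000", "srcaddr": "10.0.0.5", "dstaddr": "203.0.113.9",
       "dstport": "443", "action": "REJECT", "bytes": "840"}
print(normalize_flow_record(raw))
```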

Bonus points for reviewing your threat model (I know, again) and checking whether your logs are missing key pieces of information that would unlock even more security insights. From there, you can look into enrichment providers to fill in the gaps.
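As one illustration, enrichment can be as simple as joining events against a lookup you maintain or pull from a provider. The asset inventory below is invented purely for the example.

```python
# Hypothetical asset inventory used to enrich events with owner and criticality.
ASSET_INVENTORY = {
    "10.0.0.5":  {"owner": "payments-team", "criticality": "high"},
    "10.0.7.22": {"owner": "it-helpdesk",   "criticality": "low"},
}

def enrich(event: dict) -> dict:
    """Attach asset context to an event so detections can weigh severity properly."""
    context = ASSET_INVENTORY.get(event.get("src_ip"), {})
    return {**event,
            "asset_owner": context.get("owner"),
            "asset_criticality": context.get("criticality")}

print(enrich({"src_ip": "10.0.0.5", "action": "reject"}))
```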

Wrapping Up

This series has covered a lot of ground on pipeline strategy and functional considerations. If there’s one takeaway, it’s this: high-performance SecOps programs start with data quality. 

It’s simple in theory, yet two decades of practical experience have shown that producing quality data sets from high-performance pipelines is deceptively complex. Armed with a thoughtful strategy that aligns pipeline functionality to your threat model and detections, success is well within your reach.
