NEW

Panther joins Databricks to build the future of the security lakehouse. Read more →

BLOG

What Is Shadow AI? Why Security Teams Need to See It Early

Michelle Dufty

Jun 14, 2026

A developer pastes a function into ChatGPT to debug it. A product manager uploads a customer call transcript to summarize action items. A marketing analyst feeds an unreleased pricing strategy into Claude for competitive analysis. None of it touches your sanctioned tools. None of it shows up in your DLP dashboard. All of it is happening right now.

That's shadow AI: any AI tool your employees use for work that hasn't gone through IT approval, from consumer chatbots to code assistants to LLMs accessed outside sanctioned channels. The distinction that matters is between AI that's been through testing and evaluation before enterprise adoption and AI that hasn't. And the scale is bigger than most SOC teams realize.

About 45% of employees are regular AI users on corporate devices, and 67% of those users are accessing AI platforms through personal accounts your security team can't see. Organizations with high shadow AI exposure paid roughly $670,000 more per breach in 2025 than those with low or no shadow AI usage. 97% of organizations that experienced an AI-related security incident lacked proper AI access controls, and 63% had no AI governance policies at all.

This article covers how shadow AI enters real environments, the telemetry where it leaves traces, the detection patterns that catch it early, and a response framework that lean SOC teams can actually run.

Key Takeaways:

Shadow AI is the unauthorized use of AI tools for work tasks without IT approval. It creates risks that traditional shadow IT controls don't fully address.
Shadow AI can enter organizations through personal AI accounts, OAuth-connected apps, browser extensions and IDE copilots, hardcoded API keys, and autonomous AI agents.
Detection usually requires correlating DNS/proxy logs, identity provider and OAuth logs, SaaS audit logs, and endpoint data. No single tool gives you the full picture.
A practical response framework moves from discovery through triage, containment, governance, education, and continuous monitoring, using a Security Data Lake and AI-assisted triage to keep lean SOC teams focused.

Shadow AI vs. shadow IT: what's actually different this time

Shadow AI and shadow IT create different kinds of exposure. Shadow IT stores your data somewhere unauthorized. Shadow AI processes that data, can retain it for model training, and may reproduce sensitive patterns for other users: an output shaped by your proprietary input can surface in someone else's session weeks later. The exposure can continue after the employee closes the browser tab.

Shadow AI also transmits data as conversational HTTPS streams to domains like chat.openai.com that are typically allowlisted. Because much enterprise AI usage relies on OpenAI, Google, and Microsoft services, blocking those domains outright could also disrupt sanctioned enterprise AI.

AI outputs vary with each invocation and shift when providers update model weights, which creates audit gaps traditional shadow IT frameworks were never built for. AI agents can operate autonomously across multiple systems, unlike traditional shadow IT tools, which are typically just unauthorized apps or services used without IT approval.

How shadow AI shows up inside an organization

Shadow AI enters organizations through multiple paths, and each one leaves different evidence behind. Here are the most common operational categories, mapped to the telemetry and controls you actually have.

Sanctioned tools used in unsanctioned ways

67% of AI platform users on corporate devices access those platforms through personal, non-corporate accounts. Without tenant restriction controls or inline SSL inspection, your web proxy can't tell enterprise-managed sessions from personal ones. The rate of data policy violations associated with GenAI app usage doubled in 2025, with the average organization experiencing 223 such incidents per month.

Personal accounts and consumer AI services for work tasks

Personal AI accounts used for work tasks are the most common shadow AI vector. Samsung is the textbook case. Within 20 days of the company allowing engineers to use ChatGPT, three exposure events occurred: engineers pasted confidential semiconductor source code and uploaded a full internal meeting transcript. Samsung banned external AI tools company-wide after the incidents, but the data had already been processed.

Traditional monitoring tools built around file transfer patterns may miss this kind of conversational AI data flow.

Browser extensions, IDE plug-ins, and embedded copilots

AI coding assistants operate with deep access to local development environments. AI-assisted commits expose secrets at a rate of 3.2%, versus 1.5% for human-only commits. Browser extensions carry their own risk: malicious actors can push malware through an extension after it has been installed, bypassing any initial security review.

API keys, hardcoded models, and autonomous agents in cloud environments

AI-service credentials (API keys for LLM providers, vector databases, embedding services) increased 81%. Prompt injection is a major risk for LLM systems. AI agents that operate with their own credentials, execute multi-step tasks, and take irreversible actions create a security footprint that conventional tools were not built to address.

The security risks SOC teams actually have to deal with

Shadow AI creates a small set of recurring risks that show up in day-to-day security operations. Prioritize them by how they affect detection, investigation, and response.

Sensitive data leaving the perimeter through prompts

The most common shadow AI risk is employees pasting sensitive data into AI prompts. 39.7% of all AI interactions involve sensitive data, and the data type most often submitted to external GenAI models is source code.

New identity attack surface and OAuth exposure

OAuth tokens granted to AI integrations operate outside network-layer visibility. The Salesloft/Drift OAuth supply chain attack in August 2025 exposed over 700 organizations through stolen OAuth tokens. The enterprise's own systems were never the attack vector.

Compliance gaps that surface during audits

Shadow AI creates auditable gaps across multiple frameworks. For SOC 2, AI tools on personal accounts can sit outside enterprise access controls and monitoring workflows. For GDPR, unapproved AI use can fall outside formal governance and inventory processes, which makes assessments harder to complete. For HIPAA, only 31% of healthcare organizations are actively monitoring AI systems.

Business decisions made on hallucinated or biased output

Shadow AI output used in business decisions creates direct liability. In Moffatt v. Air Canada (2024), a tribunal held Air Canada liable for its chatbot's incorrect bereavement fare guidance. Organizations using AI output for security assessments, compliance reports, or regulatory filings face downstream liability when that output contains hallucinated content.

Where shadow AI shows up in your telemetry

You need multiple log sources to detect shadow AI reliably. Here's where the strongest signals appear, and where each source still leaves blind spots.

DNS, proxy, and egress logs

DNS resolver logs and proxy logs are the foundational detection layer. Covering OpenAI, Google, and Microsoft endpoints addresses a substantial share of common AI usage. Connections to API endpoints (like api.openai.com) are higher-priority signals than browser visits to consumer products, because they indicate programmatic access.

Identity provider and OAuth grant logs

OAuth grants are visible in identity provider audit logs (Google Workspace Admin, Microsoft Entra ID), but only if you're actively monitoring third-party app authorizations. Flag applications granted Mail.ReadWrite, Files.ReadWrite.All, or offline_access scopes. The cloud-to-cloud data access that follows is rarely forwarded to a SIEM or correlated with the original OAuth consent.
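The scope check above can be sketched in a few lines of Python. This is a minimal illustration, not an identity provider's API: the grant records are plain dictionaries, and the field names (app, user, scopes) are hypothetical stand-ins for whatever your audit log export actually uses. Only the three scope names come from the text.

```python
# Scopes called out in the text as worth flagging.
RISKY_SCOPES = {"Mail.ReadWrite", "Files.ReadWrite.All", "offline_access"}


def risky_scopes(grant: dict) -> set:
    """Return the risky scopes present in one grant record.

    Assumes scopes arrive as a single space-separated string
    (hypothetical field name: "scopes").
    """
    granted = set(grant.get("scopes", "").split())
    return granted & RISKY_SCOPES


def flag_grants(grants: list) -> list:
    """Keep only grants that include at least one risky scope."""
    flagged = []
    for grant in grants:
        hits = risky_scopes(grant)
        if hits:
            flagged.append(
                {
                    "app": grant.get("app"),
                    "user": grant.get("user"),
                    "risky_scopes": sorted(hits),
                }
            )
    return flagged
```

Exact string matching keeps the sketch simple; a production rule would likely also normalize case and handle scopes delivered as lists.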

SaaS audit logs and CASB data

CASBs discover which AI SaaS applications employees access and enforce DLP policies. They miss local models running on endpoints and cannot inspect encrypted API calls from some AI tools.

Endpoint inventory and browser extension data

Some local AI tooling can reduce or eliminate the network signals that DNS/proxy monitoring depends on, which makes endpoint visibility important. Monitor endpoint process activity for known local AI executables where that telemetry is available. For browser extensions, endpoint and device-management data from CrowdStrike FDR, Intune, or Jamf can flag installations of AI-related extensions not on your approved list.
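As a rough sketch of the extension check, the Python below compares an extension inventory against an approved list. Everything named here is an assumption for illustration: the inventory fields (extension_id, name), the approved ID, and the keyword list are hypothetical, and keyword matching on names is a crude heuristic that will miss extensions with generic names.

```python
# Hypothetical allowlist of approved extension IDs.
APPROVED_EXTENSION_IDS = {"approved-ext-001"}

# Hypothetical name fragments that suggest an AI-related extension.
AI_KEYWORDS = ("gpt", "copilot", "ai assistant", "llm")


def unapproved_ai_extensions(inventory: list) -> list:
    """Return inventory rows that look AI-related and are not approved."""
    findings = []
    for item in inventory:
        name = item.get("name", "").lower()
        looks_ai = any(keyword in name for keyword in AI_KEYWORDS)
        if looks_ai and item.get("extension_id") not in APPROVED_EXTENSION_IDS:
            findings.append(item)
    return findings
```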

Detection patterns that catch shadow AI early

The most useful shadow AI detection rules focus on repeatable behaviors rather than one-off tool names. The core patterns below help you catch common usage early and keep coverage current as AI apps change.

Domain and endpoint matching for known AI services

Start with a watchlist. A SQL query against DNS resolver logs matching destinations like api.openai.com, api.anthropic.com, generativelanguage.googleapis.com, and api.mistral.ai catches a substantial share of programmatic shadow AI traffic. Community-maintained Sigma rules for AI tool detection, like those at github.com/agentshield-ai/sigma-ai, give you a starting point you can adapt.
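The same matching logic can be expressed in Python for teams that post-process log exports. This is a sketch under stated assumptions: the record field name (query) is hypothetical, the four API hosts are the ones listed above, and chat.openai.com is included as the consumer example mentioned earlier in this article. API hits are sorted first because programmatic access is the higher-priority signal.

```python
from typing import Iterable, Optional

# API hosts from the watchlist above.
API_ENDPOINTS = {
    "api.openai.com",
    "api.anthropic.com",
    "generativelanguage.googleapis.com",
    "api.mistral.ai",
}
# Consumer product host used as an example earlier in the article.
CONSUMER_DOMAINS = {"chat.openai.com"}


def classify_query(domain: str) -> Optional[str]:
    """Return 'api', 'consumer', or None for a queried domain name."""
    # DNS logs often keep the trailing dot and mixed case; normalize both.
    name = domain.strip().lower().rstrip(".")
    if name in API_ENDPOINTS:
        return "api"
    if name in CONSUMER_DOMAINS:
        return "consumer"
    return None


def match_dns_records(records: Iterable[dict]) -> list:
    """Keep records that hit the watchlist, with API hits first."""
    hits = []
    for record in records:
        kind = classify_query(record.get("query", ""))
        if kind is not None:
            hits.append({**record, "ai_kind": kind})
    # sorted() is stable, so original order is kept within each group.
    return sorted(hits, key=lambda r: r["ai_kind"] != "api")
```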

Anomalous data transfer volumes to AI endpoints

Per-session thresholds miss slow, cumulative data exposure. An employee who pastes excerpts from a strategic document into a personal AI session once a day for a month won't trigger per-session alerts, even though cumulative exposure is significant. Cumulative exposure logic needs daily and weekly aggregation windows alongside per-session thresholds, baselined against each user's historical pattern.
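One way to sketch the cumulative logic is to roll per-event byte counts into daily totals and compare each user's current window against their own previous window. The field names (user, day, bytes_out), the 3x factor, and the 50,000-byte floor are illustrative assumptions, not recommended values; using the prior window as the baseline is a simplification of a longer historical baseline.

```python
from collections import defaultdict
from datetime import date, timedelta


def daily_totals(events: list) -> dict:
    """Sum outbound bytes to AI endpoints per (user, day)."""
    totals = defaultdict(int)
    for event in events:
        totals[(event["user"], event["day"])] += event["bytes_out"]
    return dict(totals)


def window_total(totals: dict, user: str, end_day: date, days: int) -> int:
    """Total bytes for one user over the `days` ending on `end_day`."""
    start = end_day - timedelta(days=days - 1)
    return sum(
        count
        for (u, day), count in totals.items()
        if u == user and start <= day <= end_day
    )


def exceeds_baseline(
    totals: dict,
    user: str,
    end_day: date,
    days: int = 7,
    factor: float = 3.0,
    min_bytes: int = 50_000,
) -> bool:
    """Flag a user whose current window far exceeds their previous one."""
    current = window_total(totals, user, end_day, days)
    previous = window_total(totals, user, end_day - timedelta(days=days), days)
    # The floor stops brand-new users with a zero baseline from firing on trivial volume.
    threshold = max(min_bytes, factor * previous)
    return current > threshold
```

Calling exceeds_baseline with days=1 gives the daily view and days=7 the weekly one, so both aggregation windows can run alongside an existing per-session threshold.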

OAuth scope and consent monitoring

Monitor ConsentToApplication events in your identity provider's audit logs, flagging OAuth grants to known AI applications with broad scopes. Token lifecycle monitoring matters just as much. A user approves an application from a trusted device today. Six months later, that same token can operate from a compromised endpoint on an untrusted network, because nothing reevaluated the grant.
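Token lifecycle monitoring can be approximated by re-checking each token use against the context in which consent was given. The sketch below is illustrative only: the record fields (consented_at, consent_device_id, device_id, network_trusted) and the 90-day review interval are hypothetical, and real identity providers expose this context in different shapes.

```python
from datetime import datetime, timedelta

# Hypothetical review interval; tune to your own policy.
MAX_GRANT_AGE = timedelta(days=90)


def review_reasons(grant: dict, use: dict, now: datetime) -> list:
    """Return the reasons a token use should trigger re-evaluation.

    An empty list means nothing about this use differs from the
    conditions under which the grant was originally approved.
    """
    reasons = []
    if now - grant["consented_at"] > MAX_GRANT_AGE:
        reasons.append("grant_older_than_review_interval")
    if use["device_id"] != grant["consent_device_id"]:
        reasons.append("device_changed_since_consent")
    if not use.get("network_trusted", False):
        reasons.append("untrusted_network")
    return reasons
```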

Writing shadow AI detection rules as code so coverage keeps up

The number of AI apps grew to over 3,400 in a single year, nearly four times the previous year's count. Writing shadow AI detection rules in code (Python or YAML), version-controlling them, and deploying through CI/CD pipelines means your watchlist updates and rule logic travel through the same tested pipeline as any other code change.

As Stephen Gubenia, Head of Detection Engineering for Threat Response at Cisco Meraki, puts it, "AI isn't the silver bullet; you still have to have processes in place, good logging and alerting pipelines, sound detection logic."

Panther supports detection-as-code: rules authored in Python or YAML with unit tests and CI/CD integration, so coverage updates deploy like any other code change. For teams where not every analyst writes Python, Panther's Simple Detection Builder provides a no-code path that still plugs into the same version-controlled pipeline.
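To make the idea concrete, here is a small Panther-style Python rule with its test cases kept alongside it. Treat it as a sketch: the rule(event) and title(event) entry points follow the convention Panther uses for Python rules, but the event field names (query_name, source_ip), the two-host watchlist, and the run_tests helper are hypothetical, and a plain dictionary stands in for the event object.

```python
# Hypothetical watchlist; in practice this would live in a version-controlled file.
AI_API_HOSTS = {"api.openai.com", "api.anthropic.com"}


def rule(event) -> bool:
    """Fire when a DNS query targets a watched AI API host."""
    name = str(event.get("query_name", "")).lower().rstrip(".")
    return name in AI_API_HOSTS


def title(event) -> str:
    """Alert title shown to the analyst."""
    source = event.get("source_ip", "unknown host")
    return f"AI API access from {source}: {event.get('query_name')}"


# Test cases stored next to the rule so CI can run them on every change.
TEST_CASES = [
    ("API endpoint fires", {"query_name": "api.openai.com", "source_ip": "10.0.0.7"}, True),
    ("Trailing dot and case are normalized", {"query_name": "API.Anthropic.com."}, True),
    ("Unrelated domain is ignored", {"query_name": "example.com"}, False),
]


def run_tests() -> list:
    """Return the names of failing test cases (empty list means all pass)."""
    return [name for name, event, expected in TEST_CASES if rule(event) != expected]
```

A CI job can fail the build whenever run_tests() returns a non-empty list, so a watchlist change that breaks an existing expectation never reaches production.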

From detection to response: what to do once you find it

Make AI experimentation visible. A blanket ban pushes usage further underground.

A practical response framework follows six phases:

Discover (proxy/DNS analysis, SaaS OAuth audit, AI asset inventory)
Triage (severity tiering by data classification)
Contain (block tool access, preserve evidence)
Govern (publish an AI acceptable use policy with approved tools and a clear approval pathway)
Educate (role-based training with sanctioned alternatives)
Monitor (continuous telemetry and behavioral baselining)

As Mike Hanley, CSO at GitHub, says, "It's important to socialize risks and have conversations about the right ways to achieve a particular standard." The NIST AI Risk Management Framework provides a governance backbone for AI risk management across these phases.

Severity tiering helps keep a lean team focused. The goal of incident response is to reduce the number and impact of incidents and improve the efficiency and effectiveness of detection, response, and recovery. High-risk cases involving sensitive customer or regulated data call for containment, review, and escalation. Lower-risk internal policy violations call for enforcement and user education.
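The tiering described above reduces to a small lookup. In this sketch the two tiers and their actions mirror the paragraph; the data class labels (customer, regulated, internal) and the finding structure are hypothetical and would come from your own data classification scheme.

```python
# Data classes that push a finding into the high tier (hypothetical labels).
HIGH_RISK_CLASSES = {"customer", "regulated"}

# Actions per tier, taken from the paragraph above.
TIER_ACTIONS = {
    "high": ["contain", "review", "escalate"],
    "low": ["enforce_policy", "educate_user"],
}


def triage(finding: dict) -> dict:
    """Assign a severity tier and response actions to one finding."""
    classes = set(finding.get("data_classes", []))
    tier = "high" if classes & HIGH_RISK_CLASSES else "low"
    return {"tier": tier, "actions": TIER_ACTIONS[tier]}
```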

Bringing shadow AI into the light with a Security Data Lake and AI SOC analyst

Shadow AI telemetry comes from multiple sources. A complete picture requires correlation across web proxy, DNS, DLP, OAuth, SaaS audit, endpoint, and cloud egress logs. Panther's Security Data Lake gives you the centralized store to correlate across all of them, and ownership of that data means you're not locked into a single vendor's query language or retention limits.

Docker's security team faced a similar multi-source visibility challenge. By centralizing telemetry and applying Python-based detection rules with correlation logic, they cut false positives by 85% while tripling ingestion. The same detection-engineering principles apply directly to shadow AI monitoring.

A lean SOC team may not be able to manually triage the alert volume that shadow AI monitoring generates, especially when the volume of prompts sent to GenAI services increased by 500% over the past year and enterprise AI apps grew 4x.

Panther's AI SOC analyst handles this volume directly. It's an autonomous alert triage agent within Panther AI that reviews every alert, builds context by pulling enrichments and writing pivot queries, and escalates only the ones that are genuinely risky or inconclusive. Analysts get a distilled summary with risk classification scores, follow-up recommendations, and visible reasoning, so they retain final judgment on the strategic calls. The agent handles the repetitive classification work that would otherwise pull a lean team off threat hunting.

Shadow AI isn't slowing down. The teams that stay ahead of it treat detection as code, correlate telemetry from every source that touches AI usage, and let AI-assisted triage carry the repetitive load so analysts can focus on the alerts that actually matter.