Yesterday, March 24, 2026, a threat actor group calling itself TeamPCP published two malicious versions of LiteLLM to PyPI. If you're running LiteLLM in production (and a lot of people are, since it's the most popular LLM proxy gateway in the Python ecosystem), you need to understand what happened, what the payload stole, and why the architecture decisions you made six months ago just became the most important factor in your incident response.
In this post I'll break down the attack, then walk through the strategies I use to limit the damage from exactly this class of compromise: ephemeral filesystem secrets via the Kubernetes Secrets Store CSI Driver, the case against environment variables as secret storage, honeypot credentials with canary tokens, and network-level blast radius containment.
## What Actually Happened
Already know the details? Skip to the analysis.
TeamPCP didn't hack LiteLLM directly. They compromised the supply chain upstream of LiteLLM by first poisoning Aqua Security's Trivy scanner, a widely trusted open-source vulnerability scanning tool, back on March 1st. LiteLLM's CI/CD pipeline installed Trivy without version pinning:
`ci_cd/security_scans.sh`:

```sh
# This is what got them owned
curl -sfL https://raw.githubusercontent.com/aquasecurity/trivy/main/contrib/install.sh | sh
trivy fs --exit-code 1 .
```
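One mitigation sketch, assuming Trivy's goreleaser-style `install.sh` (which accepts a release tag as its final argument). The version tag below is an illustrative placeholder, not a recommendation, and pinning the installer script itself to a commit SHA would be stronger still:

```sh
# Pin the scanner to a vetted release tag instead of whatever is on main.
# v0.50.1 is a placeholder -- substitute the release you've actually reviewed.
curl -sfL https://raw.githubusercontent.com/aquasecurity/trivy/main/contrib/install.sh \
  | sh -s -- -b /usr/local/bin v0.50.1
trivy --version          # fail fast if the pinned binary didn't install
trivy fs --exit-code 1 .
```

Pinning doesn't make a poisoned release safe, but it does mean a freshly compromised upstream branch can't silently change what your CI executes.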
The poisoned Trivy binary ran during CI, harvested the pipeline's environment variables (including PyPI upload credentials), and exfiltrated them. On March 23rd, the attackers registered the lookalike domain litellm.cloud (the legitimate site is litellm.ai). By 08:30 UTC on March 24th, two malicious packages were live on PyPI:
- **v1.82.7** - Payload injected into `litellm/proxy/proxy_server.py`, triggered on importing `litellm.proxy`
- **v1.82.8** - Payload delivered via `litellm_init.pth`, triggered on any Python startup. You didn't even need to import litellm. If it was installed in the environment, it ran.
Neither version existed on GitHub. The GitHub releases only reached v1.82.6.dev1. This was a pure PyPI-only attack.
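The `.pth` trick is worth understanding, because it's what made v1.82.8 fire without any import. A minimal, benign sketch of the mechanism (not the payload): `site.addsitedir()` is the same routine the interpreter runs over every site-packages directory at startup, and any `.pth` line beginning with `import` is executed verbatim.

```python
import os
import site
import tempfile

# Sketch of the .pth execution mechanism. Any line in a .pth file that begins
# with "import" is exec'd by site.py, which runs at every interpreter startup.
with tempfile.TemporaryDirectory() as d:
    with open(os.path.join(d, "demo.pth"), "w") as f:
        f.write("import os; os.environ['PTH_DEMO'] = 'code ran at startup'\n")
    # addsitedir() is what startup applies to each site-packages directory;
    # calling it here on the temp dir triggers the .pth line immediately.
    site.addsitedir(d)

print(os.environ.get("PTH_DEMO"))  # → code ran at startup
```

In the real attack the `.pth` file sat in site-packages, so the malicious line ran every time any Python process started in that environment.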
## The Payload: A Credential Vacuum
The malicious code was a three-stage credential harvester. It vacuumed up everything it could find on the filesystem:
- SSH keys (`~/.ssh/id_rsa`, `id_ed25519`, `config`)
- AWS credentials (`~/.aws/credentials`, `~/.aws/config`)
- GCP and Azure tokens
- Docker configs
- Kubernetes service account tokens and kubeconfig files
- `.env` files from every common application directory (`/home`, `/root`, `/opt`, `/srv`, `/var/www`, `/app`, `/data`)
- `credentials.json`, `secrets.json`, service account key files
- `/etc/shadow` and SSL private keys
- Shell history, git configs, npm tokens, PyPI tokens
- Cryptocurrency wallet data
Everything was encrypted with a random AES-256 session key, the session key encrypted with a hardcoded RSA-4096 public key, packaged into `tpcp.tar.gz`, and POST'd to `https://models.litellm.cloud/`.
The more sophisticated v1.82.8 variant went further: it deployed a Kubernetes lateral movement toolkit that spawned privileged pods across every cluster node, and installed a persistent systemd backdoor polling an external C2 for additional binaries.
If you installed litellm v1.82.7 or v1.82.8 from PyPI between 08:30 UTC and ~14:00 UTC on March 24, 2026, assume every credential accessible from that environment is compromised. PyPI has quarantined the entire litellm package. Google Mandiant has been engaged.
## The Pattern We Keep Ignoring
This is the same pattern we see over and over. The attacker gets code execution (supply chain compromise, RCE, SSRF, deserialization bug) and the first thing they do is scrape the filesystem and environment for secrets. It's the lowest-hanging fruit in the post-exploitation playbook because we keep putting secrets in plaintext where any process can read them.
The LiteLLM payload explicitly targeted:
- `~/.aws/credentials` - long-lived IAM access keys sitting in INI files
- `.env` files - the universal "dump everything here" pattern
- Environment variables via `printenv` - the payload ran a raw `printenv` and captured the entire output. If you're injecting secrets as environment variables (the approach Kubernetes recommends by default with `secretKeyRef`, and the approach most 12-factor apps use), every one of those secrets was captured by a single shell command.
- Kubernetes Secrets - base64-encoded (not encrypted) in etcd by default
- Service account key files - JSON files with permanent credentials
Environment variables deserve special attention here because they're the mechanism most operators reach for first. The Twelve-Factor App methodology popularized ENV as the canonical way to inject configuration, and Kubernetes' `envFrom` makes it trivially easy. But environment variables are visible to every process in the container, trivially dumped by any subprocess via `printenv` or `/proc/<pid>/environ`, and frequently leaked into logs, crash dumps, and error reporters. Stop treating them as secret storage.

Every one of these (flat files, env vars, base64-encoded K8s Secrets) is readable by any process with basic access. The attack doesn't need to be sophisticated. It just needs `cat` and `printenv`.
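The environment-variable failure mode is easy to demonstrate. A minimal sketch (the variable name and value are made up for the demo): once a secret lands in the environment, every child process inherits it, and one `printenv` invocation is all a harvester needs.

```python
import os
import subprocess

# Hypothetical secret injected the "12-factor" way -- value is fake.
os.environ["FAKE_API_KEY"] = "sk-demo-not-a-real-key"

# Any spawned subprocess -- a shelled-out tool, a crash reporter, a build
# step -- inherits the parent's full environment. This single call is
# equivalent to what the LiteLLM payload did with a raw printenv.
dump = subprocess.run(
    ["printenv", "FAKE_API_KEY"],
    capture_output=True,
    text=True,
).stdout.strip()
print(dump)  # → sk-demo-not-a-real-key
```

No privilege escalation, no debugger, no parsing: the secret simply arrives in the child's environment for free.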
## Ephemeral Secrets: Making the Attacker's Job Harder
This is the problem I built bitwarden-csi-provider to solve. It's an open-source Rust implementation of the Kubernetes Secrets Store CSI Driver spec that I wrote and maintain. It pulls secrets from Bitwarden Secrets Manager directly into pods as tmpfs-mounted files. The critical difference from standard Kubernetes Secrets:
- **Secrets never touch etcd.** They're fetched from Bitwarden at pod initialization and mounted directly into the pod's tmpfs filesystem. There's no Kubernetes Secret object to `kubectl get secret -o json`.
- **Secrets exist only in volatile memory.** tmpfs lives in RAM. When the pod dies, the secrets vanish. There's no PersistentVolume with your API keys sitting on a disk somewhere.
- **The application can purge the file after loading.** This is the key architectural decision. Your application reads the secret into memory at startup, then the file is deleted from the tmpfs mount. The window during which the secret is readable from the filesystem is measured in milliseconds.
`SecretProviderClass`:

```yaml
apiVersion: secrets-store.csi.x-k8s.io/v1
kind: SecretProviderClass
metadata:
  name: my-app-secrets
spec:
  provider: bitwarden
  parameters:
    secrets: |
      - id: "bf3a92e1-4c67-4d0a-9f8e-1a2b3c4d5e6f"
        path: "api-key"
      - id: "a1b2c3d4-e5f6-7890-abcd-ef1234567890"
        path: "database-url"
```
Pod spec:

```yaml
volumes:
  - name: secrets
    csi:
      driver: secrets-store.csi.k8s.io
      readOnly: true
      volumeAttributes:
        secretProviderClass: "my-app-secrets"
      nodePublishSecretRef:
        name: bsm-access-token
```
At pod init, the CSI driver calls my provider via gRPC. The provider authenticates to Bitwarden Secrets Manager, fetches the requested secrets, and writes them to the tmpfs mount. The application reads the files directly into its own memory at startup, then the files are purged:
`entrypoint.sh`:

```sh
#!/bin/sh
# Start the application - it reads secrets from /mnt/secrets at init
exec "$@" &
APP_PID=$!

# Give the app time to read secrets into memory, then destroy the files
sleep 2
rm -f /mnt/secrets/*

wait $APP_PID
```
You might be tempted to `export API_KEY=$(cat /mnt/secrets/api-key)` in your entrypoint. Don't. As we just discussed, the LiteLLM payload ran `printenv` and captured every environment variable in a single command. Environment variables are trivially readable by any child process, leaked into crash dumps, and visible via `/proc/<pid>/environ`. The entire point of ephemeral file-based secrets is to keep them out of the environment. Your application should read the file directly into its own heap memory and hold it there.
After that rm, there's nothing on the filesystem to steal. The secrets exist only in the application's process memory. An attacker would need to attach a debugger or read /proc/<pid>/mem, both of which are preventable with seccomp profiles and ptrace restrictions, and both significantly noisier than running cat on a flat file.
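On the application side, the read-then-purge step is a few lines. A minimal Python sketch, using a temp file to stand in for the CSI tmpfs mount path (the secret value is fake):

```python
import os
import tempfile

def load_secret(path: str) -> str:
    """Read a secret straight into process memory, then purge the file."""
    with open(path) as f:
        value = f.read().strip()
    os.unlink(path)  # nothing left on the tmpfs mount for `cat` or `find`
    return value

# Demo: a temp file stands in for a mount like /mnt/secrets/api-key.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "w") as f:
    f.write("sk-demo-not-a-real-key")  # made-up value for the demo

api_key = load_secret(path)
print(api_key)               # → sk-demo-not-a-real-key
print(os.path.exists(path))  # → False
```

Doing the purge in the application itself (instead of a `sleep 2` in the entrypoint) also closes the race window: the file is gone the instant it has been read.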
## What This Stops (and What It Doesn't)
Let's be precise about the threat model. Against the LiteLLM payload specifically, this architecture eliminates:
- **Filesystem credential scraping** - no `.env` files, no `credentials.json`, no `~/.aws/credentials` to harvest. The payload's `find / -name "*.env"` returns nothing.
- **Kubernetes Secret exfiltration** - no K8s Secret objects to query from a compromised service account, because secrets never materialized as K8s resources.
- **Persistent credential exposure** - if the pod restarts, new credentials are fetched. If the old credentials were compromised, rotating them in Bitwarden immediately invalidates any stolen copies.
What it doesn't stop:
- **Memory scraping** - an attacker with sufficient privileges can still read process memory via `/proc/<pid>/mem` or by attaching a debugger. This requires elevated access and is significantly noisier than reading flat files. Seccomp profiles and `ptrace` restrictions can block this entirely.
- **Network-level exfiltration** - if your application makes API calls with the loaded credentials, an attacker who controls the routing or DNS could intercept them in transit. mTLS and egress policies mitigate this.
It's worth noting what the LiteLLM payload actually did: it spawned external subprocesses (`subprocess.Popen`) that ran shell commands (`printenv`, `cat`, `find`, `curl`, `openssl`) to scrape the filesystem and environment. It did not perform in-process Python interception, monkeypatch `os.environ`, or hook API client calls at the runtime level. This was a brute-force filesystem and environment vacuum, not a sophisticated runtime attack. That distinction matters: ephemeral file-based secrets with no env var leakage directly defeats this specific payload's collection mechanism.
This is not a silver bullet. Nothing is. But it dramatically reduces the initial blast radius. Instead of every credential on the system being harvestable by printenv and a recursive find, the attacker would need to read a specific process's heap memory during a specific window. That's a fundamentally harder problem, and a fundamentally more detectable one.
## Defense in Depth: SIEM, Honeypots, and Detection
The combination I run:
### Honeypot credentials

Plant fake `.env` files, `~/.aws/credentials`, and `credentials.json` files with honeypot credentials that alert on use. These are credentials that look real but point to monitored endpoints. When the LiteLLM payload (or any credential harvester) scrapes them and tries to use them, you get an immediate alert.
`~/.aws/credentials` (honeypot):

```ini
[default]
aws_access_key_id = AKIA5HONEYPOT2DETECT
aws_secret_access_key = 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
```
AWS makes this nearly free: every use of an IAM access key lands in CloudTrail, so a decoy key plus an alert rule on its activity is a fully native canary token. No extra infrastructure needed. If you're not doing this, start today.
### SIEM alerting on secret access patterns
Monitor for processes reading credential files that shouldn't be reading them. auditd rules on sensitive paths, Falco rules for unexpected file access in containers, and network egress alerts for connections to unknown endpoints. The LiteLLM payload made outbound HTTPS connections to litellm.cloud. Egress filtering and DNS monitoring would have flagged this immediately.
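As a concrete sketch, auditd file watches on the paths this payload targeted look like the following. The paths and the `cred_read` key name are illustrative; drop the rules into `/etc/audit/rules.d/` and reload them with your distribution's audit tooling:

```
# Illustrative auditd watch rules: alert on any read of credential files
-w /root/.aws/credentials -p r -k cred_read
-w /root/.ssh -p r -k cred_read
-w /etc/shadow -p r -k cred_read
# Review hits with: ausearch -k cred_read
```

Legitimate processes rarely read these paths after boot, so even a noisy rule set here has a high signal-to-noise ratio.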
### Network policies as blast radius containment
Kubernetes NetworkPolicies should restrict egress to known endpoints. If your LLM proxy pod only needs to talk to OpenAI's API, it shouldn't be able to POST encrypted tarballs to arbitrary domains. This is basic hygiene that most clusters still don't implement.
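A minimal sketch of such a policy, assuming a pod labeled `app: llm-proxy` and a documentation-range CIDR standing in for your API provider's real published ranges:

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: llm-proxy-egress
spec:
  podSelector:
    matchLabels:
      app: llm-proxy             # hypothetical pod label
  policyTypes:
    - Egress                     # everything not matched below is denied
  egress:
    - to:
        - ipBlock:
            cidr: 203.0.113.0/24 # placeholder; substitute the API's real ranges
      ports:
        - protocol: TCP
          port: 443
    - ports:                     # allow DNS lookups
        - protocol: UDP
          port: 53
```

With `policyTypes: [Egress]` set, any destination not explicitly listed is blocked, so a POST of `tpcp.tar.gz` to an attacker-registered domain simply never leaves the pod.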
## The Bigger Picture
The LiteLLM attack is a case study in how modern supply chain compromises actually work. It wasn't a sophisticated zero-day. It was:
1. Compromise an upstream dependency (Trivy)
2. Harvest CI/CD credentials from the target project
3. Publish malicious packages to the package registry
4. Scrape every secret off the filesystem of every machine that installs your package
Step 4 is where the damage happens, and it's where your architecture decisions determine whether the attacker gets your production database credentials or an empty directory.
The supply chain problem isn't going away. Package registries are high-value targets precisely because they're trusted implicitly by CI/CD pipelines and production deployments. You can't prevent every compromise upstream. What you can control is how much damage a compromised dependency can do once it's running in your environment.
Ephemeral secrets aren't exotic. The Secrets Store CSI Driver is a CNCF project. Bitwarden Secrets Manager is free for small teams. I wrote bitwarden-csi-provider because this tooling should exist and be open source. It deploys as a DaemonSet and the integration takes an afternoon. The question is whether you'll adopt it before or after your credentials end up in a tarball on someone else's server.