TL;DR: On March 24, 2026, attackers published malicious versions of LiteLLM (v1.82.7 and v1.82.8) to PyPI that harvested cloud credentials, SSH keys, Kubernetes tokens, and crypto wallets. The attack originated from a compromised Trivy security scanner in LiteLLM’s own CI/CD pipeline. The malicious packages were live for roughly five hours before PyPI quarantined them. If you build anything on top of LLM infrastructure, this incident should change how you think about your dependency chain.


What happened

LiteLLM is an open-source LLM proxy that lets you call 100+ model providers through one unified API. It’s installed in roughly 36% of cloud environments that use LLM tooling, with millions of daily downloads from PyPI. It is, by most measures, critical AI infrastructure.

On March 24, a threat actor group known as TeamPCP published two compromised versions of the litellm package to PyPI. These weren’t typosquats or lookalike packages. They were published under the real litellm package name, using stolen maintainer credentials.

The timeline was tight:

  • 10:39 UTC: v1.82.7 published with malicious payload embedded in proxy_server.py
  • Shortly after: v1.82.8 published with a second, more aggressive injection technique
  • ~16:00 UTC: PyPI quarantined both versions after community reports surfaced on Hacker News and r/LocalLLaMA

Five hours. That’s all it took.


How the attackers got in

This is the part that should make every engineer uncomfortable. The attackers didn’t find a zero-day in LiteLLM’s code. They didn’t brute-force a password. They compromised a security tool.

Five days earlier, on March 19, TeamPCP had already compromised Trivy, a popular open-source vulnerability scanner made by Aqua Security. They used stolen credentials to publish a malicious Trivy version (0.69.4) and force-pushed 76 of 77 aquasecurity/trivy-action tags to point at malicious commits.

LiteLLM used Trivy in its CI/CD pipeline — specifically in GitHub Actions — to scan for vulnerabilities. The irony is hard to miss: the tool meant to protect the supply chain became the entry point.

When LiteLLM’s CI ran the compromised Trivy action, the attackers harvested PyPI credentials from the GitHub Actions environment. With those credentials, they published the poisoned packages directly, bypassing every other safeguard in LiteLLM’s release process.

The attack then cascaded. Between March 19 and March 24, TeamPCP used the same technique to compromise npm packages, Checkmarx, OpenVSX, and finally LiteLLM — a coordinated, multi-ecosystem supply chain campaign.


What the malware actually did

The payload was sophisticated. This was no simple credential dump, but a five-stage attack designed for maximum damage in cloud-native environments:

Stage 1 — Reconnaissance. Sweep the host for everything valuable: environment variables, SSH keys, AWS/GCP/Azure credentials, Kubernetes service account tokens, Docker configs, shell history, database passwords, CI/CD secrets, and cryptocurrency wallets.

Stage 2 — Encryption and exfiltration. Package the harvested data, encrypt it with AES-256 (session key protected by RSA-4096), and send it to models.litellm[.]cloud — a domain designed to look like legitimate LiteLLM infrastructure.

Stage 3 — Persistence. Install a systemd service (sysmon.service) that survives reboots, disguised as a system monitoring process.

Stage 4 — Lateral movement. If Kubernetes access is available, deploy privileged pods named node-setup-* on every node in the cluster.

Stage 5 — Command and control. Beacon to checkmarx[.]zone/raw for follow-on payloads — meaning the initial compromise was just the foothold.

The v1.82.8 variant was particularly nasty. Instead of embedding the payload in a Python module (which only executes when imported), it dropped a .pth file into site-packages/. These files are meant to add directories to sys.path, but Python's site module will execute any line in them that begins with import, and it processes them on every interpreter startup. That means the malware ran whenever Python launched: pip install, python -c, even your IDE's language server. No import required.
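The .pth trick described above is easy to demonstrate harmlessly. This sketch writes a benign .pth file into a temporary directory and triggers the same scan the site module runs at startup; the payload here just sets a flag instead of doing anything malicious:

```python
import os
import site
import sys
import tempfile

# A benign stand-in for the malicious payload: the site module executes
# any .pth line that starts with "import" when it scans the directory.
demo_dir = tempfile.mkdtemp()
with open(os.path.join(demo_dir, "demo.pth"), "w") as f:
    # One line beginning with "import": site will exec() it verbatim.
    f.write("import sys; sys.pth_demo_ran = True\n")

# At startup Python does this automatically for site-packages;
# here we trigger the same scan manually on our temp directory.
site.addsitedir(demo_dir)

print(getattr(sys, "pth_demo_ran", False))  # → True
```

In the real attack the executed line was the malware loader, and the scan happened implicitly at interpreter startup, which is why no import of litellm was needed.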


Why this matters beyond LiteLLM

It’s tempting to treat this as a LiteLLM-specific story. It isn’t. Here’s what this incident reveals about the state of AI infrastructure:

1. LLM proxies are high-value targets

LiteLLM sits between your application and every LLM provider you use. By design, it has access to API keys for OpenAI, Anthropic, Google, Azure, and dozens of other providers. Compromising the proxy means compromising access to all of them.

This is the fundamental tension of the proxy pattern: centralization is convenient, but it creates a single point of compromise. Every API key, every request, every response flows through one component. If that component is poisoned, everything downstream is exposed.

2. Your security tools are part of your attack surface

LiteLLM was running Trivy to be responsible. Vulnerability scanning in CI/CD is a best practice. But the scanner itself was compromised, and because it ran in a privileged CI environment with access to deployment credentials, it became the perfect vector.

This is a second-order trust problem. You audit your dependencies. But do you audit the tools that audit your dependencies? And the tools those tools depend on? The trust chain is longer than most teams realize.

3. PyPI (and npm, and every package registry) is a trust-on-first-install system

Once an attacker has valid maintainer credentials, there is no additional gate between them and millions of machines. PyPI doesn’t require signed releases. There’s no mandatory two-person review for publishes. The package manager trusts whoever has the token.

Projects like Sigstore and PyPI’s nascent attestation framework are working on this, but adoption is still early. For now, pip install is an act of faith.

4. The blast radius of AI infrastructure is enormous

Traditional web frameworks are valuable targets, but their blast radius is bounded. An LLM proxy compromise is different: it potentially exposes every cloud credential, every API key for every model provider, and every prompt and response flowing through the system. For companies using LiteLLM to route traffic across multiple providers, a single compromise could mean rotating credentials across five or six different AI platforms simultaneously.


Practical lessons for AI engineers

If you’re building AI pipelines, RAG systems, or anything that touches LLM infrastructure, here’s what to take away:

Pin your dependencies aggressively

Don’t use litellm>=1.80 in your requirements. Pin exact versions: litellm==1.82.6. Use hash verification where possible. Yes, this means more manual updates. That’s the point — you want a human in the loop before new code runs in your environment.

# requirements.txt — pin with hashes
litellm==1.82.6 \
    --hash=sha256:abc123...
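The --hash value is simply the SHA-256 digest of the published artifact, so you can cross-check it yourself. A minimal sketch, assuming you have downloaded the wheel locally (the filename below is hypothetical):

```python
import hashlib

def sha256_of(path: str, chunk_size: int = 65536) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Hypothetical wheel fetched with `pip download litellm==1.82.6`:
# print(sha256_of("litellm-1.82.6-py3-none-any.whl"))
# Compare the output to the --hash=sha256:... value in requirements.txt.
```

Note that once any requirement in the file carries a --hash option, pip switches into hash-checking mode and refuses to install anything unhashed, which is exactly the behavior you want here.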

Minimize credential exposure

The LiteLLM proxy pattern requires passing API keys through a central service. If you can avoid that pattern — by calling providers directly, by using local models, or by keeping credential management separate from request routing — you reduce the blast radius of any single component being compromised.

Not every workload needs a proxy. For many use cases, a direct SDK call with credentials loaded from a vault at runtime is simpler and has a smaller attack surface.
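The direct-call pattern can be sketched as follows. This is an illustration, not a prescription: the function names are mine, the environment variable stands in for a real vault client, and the actual SDK call is left as a comment:

```python
import os

class MissingCredential(RuntimeError):
    pass

def load_api_key(env_var: str = "OPENAI_API_KEY") -> str:
    """Fetch the key at call time; a real deployment might swap in a
    vault client lookup here instead of the process environment."""
    key = os.environ.get(env_var)
    if not key:
        raise MissingCredential(f"{env_var} is not set")
    return key

def call_provider(prompt: str) -> str:
    # Hypothetical direct SDK call: the key is resolved per request,
    # never passes through a shared proxy, and never sits in a config
    # file on disk waiting to be harvested.
    api_key = load_api_key()
    # client = openai.OpenAI(api_key=api_key)  # sketch, not executed here
    return f"would send {len(prompt)} chars with key ending ...{api_key[-4:]}"
```

The point of resolving the credential at call time is that a compromised dependency scanning the filesystem or a central config store finds nothing; the secret exists only briefly, in one process.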

Audit your CI/CD pipeline dependencies

Your GitHub Actions, your Docker base images, your linting tools, your security scanners — these all run with elevated privileges. Treat them with the same scrutiny you’d give a production dependency. Pin action versions to commit SHAs, not tags (tags can be force-pushed, as TeamPCP demonstrated).

# Don't do this: branch and tag refs are mutable and can be repointed
- uses: aquasecurity/trivy-action@master

# Do this: commit SHAs are immutable
- uses: aquasecurity/trivy-action@a1b2c3d4e5f6...

Prefer local-first architectures where possible

This is where my bias shows, so take it accordingly. Tools that process data locally — without requiring API keys to function, without phoning home, without centralized credential stores — have a fundamentally smaller attack surface. There are fewer secrets to steal because there are fewer secrets in the system.

pdfmux, for example, runs entirely on your machine. There are no API keys required for core extraction. No cloud service to authenticate against. No credentials flowing through a proxy. A supply chain attack against pdfmux’s PyPI package would still be serious (any compromised package is), but the attacker wouldn’t find a treasure trove of cloud credentials waiting to be harvested — because they don’t exist in the runtime environment.

This isn’t a criticism of LiteLLM’s architecture. Proxying requests across multiple LLM providers is genuinely useful, and LiteLLM does it well. But the incident is a reminder that architectural choices have security implications. Every API key your system stores is a key an attacker can steal.

Monitor for anomalous package publishes

If you depend on critical packages, set up alerts for new version publishes. Tools like socket.dev and Phylum can flag suspicious changes (new network calls, new file system access, obfuscated code). The community caught the LiteLLM compromise within hours partly because people were already watching after the Trivy incident days earlier.
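A bare-bones version of this monitoring is possible with nothing but PyPI's public JSON API (https://pypi.org/pypi/&lt;package&gt;/json, which lists every release under the "releases" key). A sketch, with the diff logic separated from the network call so it can be tested offline; the alerting is just a print:

```python
import json
import urllib.request

def new_versions(known: set[str], published: set[str]) -> set[str]:
    """Versions on the index that your team has not reviewed yet."""
    return published - known

def fetch_published_versions(package: str) -> set[str]:
    # PyPI's JSON API returns metadata including all release versions.
    url = f"https://pypi.org/pypi/{package}/json"
    with urllib.request.urlopen(url, timeout=10) as resp:
        data = json.load(resp)
    return set(data["releases"])

if __name__ == "__main__":
    known = {"1.82.5", "1.82.6"}  # versions your team has vetted
    unseen = new_versions(known, fetch_published_versions("litellm"))
    if unseen:
        print(f"unreviewed litellm releases: {sorted(unseen)}")  # page someone
```

A cron job running this would have flagged v1.82.7 within minutes of publication; whether anyone looks at the alert before CI picks up the new version is the part that still requires process.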


What LiteLLM did right

Credit where it’s due: LiteLLM’s response was fast and transparent. They published a detailed security advisory within hours, engaged Mandiant for forensic analysis, paused all new releases pending a supply chain review, and provided scanning scripts so affected users could check their environments.

Their Docker image users and cloud platform users were unaffected because those distribution channels weren't compromised; only the PyPI package was. That is a real argument for distributing server-side software as container images pinned by digest (image@sha256:...): a digest is content-addressed and can't be silently repointed, a layer of verification that a plain pip install doesn't provide.

The community response was also impressive. The initial detection came from developers on Hacker News and Reddit who noticed anomalous behavior, not from automated security scanning. Human vigilance caught what automated tools missed — in part because the automated tools were themselves compromised.


The bigger picture

The LiteLLM incident is the most significant supply chain attack to hit the AI/ML ecosystem so far, but it won’t be the last. As LLM infrastructure becomes more critical to businesses, the packages that form that infrastructure become higher-value targets.

The attack earned CVE-2026-33634 with a CVSS score of 9.4 out of 10. That score reflects what security researchers already knew: AI infrastructure components are becoming as critical as databases and auth systems, but they haven’t yet received the same level of security scrutiny.

Every AI engineer should be asking: what happens if my most trusted dependency publishes a malicious update tomorrow? If you don’t have a clear answer, this week is a good time to figure one out.


If you’re building AI pipelines and want to reduce your dependency on cloud-connected tools, pdfmux handles PDF extraction locally — no API keys, no GPU, no cloud service. pip install pdfmux to get started, or check the docs for integration guides.