Bumblebee: Perplexity's Supply-Chain Security Scanner for Developer Machines
Bumblebee (https://github.com/perplexityai/bumblebee) is a read-only scanner from Perplexity that inventories packages, extensions, and developer-tool metadata on macOS and Linux machines. When a supply-chain advisory names a compromised package or version, Bumblebee tells you which developer machines have it on disk.
SBOMs answer what shipped. EDR answers what ran. Supply-chain response often needs a third view: messy local state across lockfiles, package-manager metadata, extension manifests, and MCP configs. Bumblebee turns that scattered state into structured NDJSON records and, given an exposure catalog, flags exact matches.
Key Features
- Zero Dependencies - Single static Go binary with no runtime dependencies; Go 1.25+ is needed only to build and install it
- Read-Only Design - Scans on-disk metadata only, never executes package managers
- Three Scan Profiles - baseline (global), project (workspaces), deep (incident response with exposure catalogs)
- Broad Coverage - npm, pnpm, Yarn, Bun, PyPI, Go modules, RubyGems, Composer, MCP configs, editor extensions, browser extensions
- Exposure Catalogs - Compare inventory against known-compromised package lists
- NDJSON Output - Structured records per package and per finding, pipeable into SIEM pipelines
- Content-Addressed IDs - Stable record IDs across runs for deduplication
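To make the content-addressed ID idea concrete, here is a minimal sketch of how a stable record ID could be derived by hashing a record's identifying fields. The field selection (`ID_FIELDS`) and the use of SHA-256 over canonical JSON are assumptions for illustration only; Bumblebee's actual ID scheme may differ.

```python
import hashlib
import json

# Hypothetical choice of identifying fields; Bumblebee's real scheme may differ.
ID_FIELDS = ("ecosystem", "package_name", "version", "source_type")


def record_id(record: dict) -> str:
    """Return a SHA-256 hex digest over the record's identifying fields."""
    identity = {field: record.get(field) for field in ID_FIELDS}
    # sort_keys and fixed separators yield identical bytes for identical content.
    canonical = json.dumps(identity, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def dedupe(records):
    """Keep the first record seen for each content-derived ID."""
    seen = {}
    for record in records:
        seen.setdefault(record_id(record), record)
    return list(seen.values())
```

Because the ID depends only on content, the same package observed in two different scan runs hashes to the same value, which is what makes deduplication across runs cheap.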
What It Scans
Bumblebee reads lockfiles and metadata files. It does not run npm ls, pip show, or go list. It covers:
| Ecosystem | Sources |
|---|---|
| npm | package-lock.json, pnpm-lock.yaml, yarn.lock, bun.lock + node_modules/**/package.json |
| PyPI | *.dist-info/METADATA, INSTALLER, direct_url.json, *.egg-info/PKG-INFO |
| Go | go.sum, go.mod |
| RubyGems | Gemfile.lock, installed *.gemspec |
| Composer | composer.lock, vendor/composer/installed.json |
| MCP | mcp.json, claude_desktop_config.json, cline_mcp_settings.json, Gemini CLI settings |
| Editor Extensions | VS Code, Cursor, Windsurf, VSCodium manifests |
| Browser Extensions | Chromium manifest.json, Firefox extensions.json per profile |
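As an illustration of what "reading metadata instead of running the package manager" can look like, the sketch below lists name and version pairs from an npm lockfile (v2/v3 layout, where a top-level `packages` map is keyed by install path). It is not Bumblebee's parser; the function name and sample data are hypothetical.

```python
import json


def packages_from_lockfile(text: str):
    """Yield (name, version) pairs from npm lockfile v2/v3 text."""
    lock = json.loads(text)
    for path, entry in lock.get("packages", {}).items():
        if path == "" or "version" not in entry:
            continue  # skip the root project and entries without a version
        # The package name is whatever follows the last "node_modules/".
        name = entry.get("name") or path.rpartition("node_modules/")[2]
        yield name, entry["version"]


# Made-up lockfile content with one top-level and one nested, scoped package.
sample = """{
  "lockfileVersion": 3,
  "packages": {
    "": {"name": "demo", "version": "0.0.0"},
    "node_modules/example-pkg": {"version": "1.2.3"},
    "node_modules/example-pkg/node_modules/@scope/dep": {"version": "4.5.6"}
  }
}"""

for name, version in sorted(packages_from_lockfile(sample)):
    print(name, version)
```

Nothing here shells out to `npm`, so a malicious install script in a compromised package never gets a chance to run during the inventory.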
Get Started
Installation
Requires Go 1.25+ on macOS or Linux:
# Install latest release
go install github.com/perplexityai/bumblebee/cmd/bumblebee@latest
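# Or pin a specific version (substitute a real release tag for vX.Y.Z)
go install github.com/perplexityai/bumblebee/cmd/bumblebee@vX.Y.Z
If bumblebee is not found after install, make sure Go's bin directory ($GOBIN, or $(go env GOPATH)/bin when $GOBIN is unset) is on your $PATH: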
export PATH=$PATH:$(go env GOPATH)/bin
Make it permanent:
echo 'export PATH=$PATH:$(go env GOPATH)/bin' >> ~/.zshrc
Smoke Test
Run the built-in self-test against embedded fixtures:
bumblebee selftest
# selftest OK (3 findings in 1ms)
The fixtures use deliberately fake package names and make no network calls. A non-zero exit means something is wrong with the install.
Usage
Baseline Scan (Global Inventory)
Scans common global package roots, language toolchains, editor/browser extensions, and MCP configs:
bumblebee scan --profile baseline > inventory.ndjson
This is the command to run on a schedule (cron, launchd, systemd).
Project Scan
Target specific development directories:
bumblebee scan --profile project \
  --root "$HOME/code" \
  --root "$HOME/Developer"
Deep Scan (Incident Response)
For on-demand checks against known compromises:
bumblebee scan --profile deep \
  --root "$HOME" \
  --exposure-catalog ./threat_intel/ \
  --findings-only
Point --exposure-catalog at a JSON file or a directory of *.json catalog files. The --findings-only flag suppresses normal package records and shows only matches.
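A findings-only stream lends itself to a simple gate in a scheduled job or CI step. The sketch below reads NDJSON lines and reports whether any finding has a blocking severity. The helper names are hypothetical, the field names are taken from the sample records in this article, and severity values other than "critical" are an assumption.

```python
import json


def blocking_findings(lines, severities=("critical",)):
    """Return finding records whose severity is in `severities`."""
    hits = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines in the stream
        record = json.loads(line)
        is_finding = record.get("record_type") == "finding"
        if is_finding and record.get("severity") in severities:
            hits.append(record)
    return hits


def exit_code(lines, severities=("critical",)) -> int:
    """Return 1 when at least one blocking finding is present, else 0."""
    return 1 if blocking_findings(lines, severities) else 0
```

One way to use it is to pipe the deep scan's output into a small script that calls `sys.exit(exit_code(sys.stdin))`, so the job fails only when a blocking finding appears.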
Filter by Ecosystem
Limit a run to specific package managers:
bumblebee scan --profile baseline --ecosystem npm,pypi
bumblebee scan --profile baseline --ecosystem go
Preview Scan Roots
See what directories Bumblebee will scan without actually scanning:
bumblebee roots --profile baseline
Threat Intel Catalogs
The repo ships with maintained exposure catalogs in threat_intel/. These are JSON files built from public threat-intelligence reporting on recent supply-chain campaigns. The format is straightforward:
{
  "schema_version": "0.1.0",
  "entries": [
    {
      "id": "advisory-2026-0042",
      "name": "example-pkg 1.2.3 (compromised release)",
      "ecosystem": "npm",
      "package": "example-pkg",
      "versions": ["1.2.3"],
      "severity": "critical"
    }
  ]
}
Point --exposure-catalog at the threat_intel/ directory and Bumblebee will match every inventoried package against every entry. Findings output includes the catalog ID, severity, and evidence.
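The matching step can be pictured as an exact lookup on ecosystem, package name, and version. The following is a minimal sketch under that assumption, using the catalog and record field names shown in this article; it is not Bumblebee's implementation, and `load_index` and `match` are hypothetical helpers.

```python
import json


def load_index(catalog_text: str) -> dict:
    """Map (ecosystem, package, version) to the catalog entry that names it."""
    index = {}
    for entry in json.loads(catalog_text).get("entries", []):
        for version in entry.get("versions", []):
            index[(entry["ecosystem"], entry["package"], version)] = entry
    return index


def match(record: dict, index: dict):
    """Return a finding dict for an exact name+version hit, else None."""
    key = (
        record.get("ecosystem"),
        record.get("package_name"),
        record.get("version"),
    )
    entry = index.get(key)
    if entry is None:
        return None
    return {
        "record_type": "finding",
        "finding_type": "package_exposure",
        "severity": entry["severity"],
        "catalog_id": entry["id"],
        "package_name": record["package_name"],
        "version": record["version"],
        "evidence": "exact name+version match",
    }
```

Building the index once keeps each lookup constant-time, which matters when a deep scan walks tens of thousands of files.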
Output Format
Records are NDJSON, one per line. Package records look like this:
{
  "record_type": "package",
  "ecosystem": "npm",
  "package_name": "@tanstack/query-core",
  "version": "5.59.20",
  "source_type": "pnpm-lockfile",
  "confidence": "high",
  "endpoint": {
    "hostname": "my-mbp",
    "os": "darwin",
    "arch": "arm64"
  }
}
Finding records add the exposure match details:
{
  "record_type": "finding",
  "finding_type": "package_exposure",
  "severity": "critical",
  "catalog_id": "advisory-2026-0042",
  "package_name": "example-pkg",
  "version": "1.2.3",
  "evidence": "exact name+version match"
}
What I Found Running It
I tested Bumblebee on a developer machine. Here is what the numbers looked like:
- Self-test passed with 3 findings in 1ms
- Baseline scan found 615 package records (354 npm, 139 PyPI, 122 Go)
- Deep scan with threat intel catalogs across 50,999 files returned 0 findings
- All 19 test packages pass
The zero-findings result is good news: none of the scanned packages matched a known-compromised version in the current threat catalogs. Running this on your own machine once a month is a cheap way to keep that peace of mind.
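Per-ecosystem totals like the ones above can be tallied straight from the NDJSON output. This is a small sketch that assumes the `record_type` and `ecosystem` fields shown in the sample records; the function name is hypothetical.

```python
import json
from collections import Counter


def count_by_ecosystem(lines):
    """Count package records per ecosystem in an NDJSON stream."""
    counts = Counter()
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        record = json.loads(line)
        if record.get("record_type") == "package":
            counts[record.get("ecosystem", "unknown")] += 1
    return counts
```

Passing an open file handle for inventory.ndjson works because iterating a file yields one line at a time, so large inventories never need to be loaded whole.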
Platforms
- 🍎 macOS (Apple Silicon & Intel)
- 🐧 Linux
🔗 GitHub: github.com/perplexityai/bumblebee
Why This Tool Rocks
- Fills a gap between SBOM and EDR for supply-chain incident response
- Reads messy local state that existing tools overlook (browser extensions, MCP configs)
- Ships with actual threat intel catalogs, not just a framework
- Stable record IDs mean you can deduplicate across scan runs
- Single static binary, no runtime dependencies
- Apache 2.0 licensed, written in Go with zero non-stdlib dependencies
Crepi il lupo! (Italian for "may the wolf die", the traditional reply to a good-luck wish) 🐺