Jorge Morais — Senior Full-Stack Developer at the intersection of AI and Industrial IoT · Remote, EU

depguard - MCP Security Server for AI Coding Agents

How a token-saving experiment became an open-source npm security tool

Active (v1.0 → v1.9)
Author & Maintainer
2026
TypeScript · Node.js · MCP Protocol · JSON-RPC 2.0 · npm Registry API · GitHub Advisory API · CycloneDX 1.6

Context

I use Claude Code daily as a development companion. One thing I noticed quickly: AI agents burn through tokens fast when researching npm packages. Every time the agent needs to decide whether to install a dependency, it runs WebSearch, reads the npm page, checks for vulnerabilities, and compares alternatives. That easily adds up to 10,000+ tokens per package decision. I thought: what if I could give the agent a single tool that answers all those questions in one call? That was the first seed of depguard, a simple MCP server that audits npm packages and saves tokens. But then something happened that changed the project's direction entirely.

Technical Challenges

Token Cost of Manual Research

Each npm package decision cost the AI agent 10,000+ tokens in web searches, page fetches, and reasoning. Multiplied by dozens of installs per session, that cost adds up quickly.

Supply Chain Attack Incident

A widely used npm package I depended on had a critical vulnerability in a specific version range. It was listed in the GitHub Advisory Database, but npm audit did not report it. I realized AI agents install packages blindly without checking multiple advisory sources.

AI Hallucinated Package Names

AI agents sometimes suggest package names that do not exist on npm. These phantom names are a real supply chain attack vector. An attacker can register the hallucinated name with malicious code.
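
A minimal sketch of such an existence check, assuming the public npm registry endpoint (https://registry.npmjs.org/<name>), which returns 404 for names that have never been published. The `packageExists` helper and the `FetchLike` type are hypothetical names for illustration, not depguard's actual API. The fetch function is injected so the check can be tested offline.

```typescript
// Minimal response shape the check needs; the global fetch in Node.js 18+ satisfies it.
type FetchLike = (url: string) => Promise<{ status: number }>;

interface ExistsResult {
  exists: boolean;
  checkedUrl: string;
}

// Hypothetical helper: ask the registry whether a package document exists for this name.
async function packageExists(name: string, fetcher: FetchLike): Promise<ExistsResult> {
  // Scoped names such as @scope/pkg need the slash escaped in the registry URL.
  const checkedUrl = `https://registry.npmjs.org/${name.replace("/", "%2F")}`;
  const res = await fetcher(checkedUrl);
  if (res.status === 200) return { exists: true, checkedUrl };
  if (res.status === 404) return { exists: false, checkedUrl };
  // Any other status (rate limit, outage) is not evidence either way.
  throw new Error(`Unexpected registry status ${res.status} for ${name}`);
}
```

Throwing on unexpected statuses, rather than returning `exists: false`, keeps a registry outage from being mistaken for a hallucinated name.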

The Solution

Technology Stack

Core

  • TypeScript (strict mode)
  • Node.js 18+ built-ins only
  • Zero runtime dependencies

Protocol

  • MCP (Model Context Protocol)
  • JSON-RPC 2.0 over stdio

Security

  • npm Registry API
  • GitHub Advisory Database
  • CVSS scoring
  • Levenshtein distance
  • 18+ malware patterns
  • CycloneDX 1.6 SBOM

Testing

  • node:test (built-in)
  • 298 offline tests
  • Mock fetch injection

Code Example

The guard function: pre-install check that verifies package existence, detects typosquatting, and runs a quick audit before allowing installation.

TypeScript
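// Pre-install guardian: the core of depguard's prevention layer
type Decision = 'allow' | 'warn' | 'block'
const RANK: Record<Decision, number> = { allow: 0, warn: 1, block: 2 }
// Later checks may escalate a decision but never downgrade it
const escalate = (current: Decision, next: Decision): Decision =>
  RANK[next] > RANK[current] ? next : current

async function guard(packageName: string, options: GuardOptions): Promise<GuardResult> {
  // Assumption: these settings are carried on GuardOptions
  const { targetLicense, fetcher, threshold } = options
  const reasons: string[] = []
  let decision: Decision = 'allow'

  // Step 1: Does this package even exist? (AI hallucination guard)
  const verifyResult = await verify(packageName)
  if (!verifyResult.exists) {
    return { decision: 'block', reasons: ['Package does not exist on npm'] }
  }

  // Step 2: Is it a typosquat? (Levenshtein against 100+ popular packages)
  if (verifyResult.possibleTyposquat) {
    reasons.push('Possible typosquat of: ' + verifyResult.similarTo.join(', '))
    decision = escalate(decision, 'warn')
  }

  // Step 3: Quick audit + score
  const [auditReport, scoreReport] = await Promise.all([
    audit(packageName, targetLicense, fetcher),
    score(packageName, { targetLicense, fetcher }),
  ])

  // Critical vulns = automatic block. No exceptions.
  if (auditReport.vulnerabilities.critical > 0) {
    reasons.push('Critical vulnerabilities: ' + auditReport.vulnerabilities.critical)
    decision = escalate(decision, 'block')
  }

  // Score below threshold = warn or block
  if (scoreReport.total < threshold - 20) {
    reasons.push('Score ' + scoreReport.total + ' is far below threshold ' + threshold)
    decision = escalate(decision, 'block')
  } else if (scoreReport.total < threshold) {
    reasons.push('Score ' + scoreReport.total + ' is below threshold ' + threshold)
    decision = escalate(decision, 'warn')
  }

  // Assumption: the summary is the per-severity vulnerability counts
  const auditSummary = auditReport.vulnerabilities
  return { decision, score: scoreReport.total, reasons, auditSummary }
}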
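
Step 2 relies on Levenshtein distance against a list of popular packages. The sketch below shows one way that comparison could work. The four-entry watchlist, the `similarPopularPackages` name, and the distance cutoff of 2 are illustrative assumptions, not depguard's actual values.

```typescript
// Classic two-row dynamic-programming edit distance
// (insertions, deletions, and substitutions each cost 1).
function levenshtein(a: string, b: string): number {
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
    }
    prev = curr;
  }
  return prev[b.length];
}

// Hypothetical watchlist; the real one is described as 100+ popular packages.
const WATCHLIST = ["react", "lodash", "express", "axios"];

// Returns the popular packages this name is suspiciously close to.
function similarPopularPackages(name: string, maxDistance = 2): string[] {
  // An exact match is the popular package itself, not a typosquat.
  if (WATCHLIST.includes(name)) return [];
  return WATCHLIST.filter((popular) => levenshtein(name, popular) <= maxDistance);
}
```

Returning early on an exact match is what keeps legitimate popular packages from flagging themselves.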

Key Technical Decisions

Zero runtime dependencies

Rationale: A security tool cannot be a supply chain risk itself. Using only Node.js built-ins (fetch, crypto, readline, fs) eliminates the third-party dependency attack surface entirely.
Trade-off: More code to write (custom semver parser, MCP protocol handler, disk cache), but the trust factor is worth it.

Security ceiling on scoring

Rationale: Initial scoring allowed popular packages with critical vulnerabilities to score above 60 (installable). A package with 1 critical vuln could score 66/100 because other dimensions compensated. Fixed by adding a hard ceiling: critical = max 30, high = max 50.
Trade-off: Some legitimate packages with known-but-mitigated vulns get low scores, but false safety is worse than false alarms.
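
The ceiling rule can be sketched as a clamp applied after the weighted score is computed. The ceilings (30 and 50) come from the description above. The function name and the `VulnCounts` shape are assumptions for illustration.

```typescript
interface VulnCounts {
  critical: number;
  high: number;
}

// Ceilings as stated in the text: any critical vuln caps the score at 30, any high at 50.
const CRITICAL_CEILING = 30;
const HIGH_CEILING = 50;

// Hypothetical helper: clamp a weighted 0-100 score by the worst severity present.
function applySecurityCeiling(weightedScore: number, vulns: VulnCounts): number {
  if (vulns.critical > 0) return Math.min(weightedScore, CRITICAL_CEILING);
  if (vulns.high > 0) return Math.min(weightedScore, HIGH_CEILING);
  return weightedScore;
}
```

With this clamp, the 66/100 case from the rationale drops to 30 regardless of how well the other dimensions score.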

Conservative sweep (when in doubt, stay silent)

Rationale: Dead dependency detection could recommend removing packages that are actually used in ways static analysis cannot detect (dynamic imports, CI scripts, config files). Solution: classify uncertain cases as "maybe-unused" and always include a safety note.
Trade-off: Lower recall (some truly unused deps are marked as maybe-unused), but zero risk of recommending removal of critical dependencies.
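
A rough sketch of that three-way classification, under the assumption that the analyzer has already counted static imports and collected signals it cannot resolve (dynamic imports, mentions in CI scripts or config files). All names here are hypothetical.

```typescript
type Verdict = "used" | "maybe-unused" | "unused";

interface Evidence {
  staticImports: number;      // import/require statements found for this package
  uncertainSignals: string[]; // e.g. "dynamic import present", "named in CI script"
}

interface SweepFinding {
  name: string;
  verdict: Verdict;
  note: string;
}

// Attached to every finding, as the text describes.
const SAFETY_NOTE = "Static analysis only: confirm manually before removing anything.";

function classifyDependency(name: string, evidence: Evidence): SweepFinding {
  let verdict: Verdict;
  if (evidence.staticImports > 0) verdict = "used";
  // Any unresolved signal downgrades certainty instead of being ignored.
  else if (evidence.uncertainSignals.length > 0) verdict = "maybe-unused";
  else verdict = "unused";
  return { name, verdict, note: SAFETY_NOTE };
}
```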

Measurable Results

12 MCP Tools
0 Runtime Dependencies
298 Offline Tests
~99% Token Savings per Audit
25+ License Types
100+ Typosquat Watchlist

Business Impact

depguard saves approximately 11,000 tokens per package audit compared to manual AI research (WebSearch + WebFetch + reasoning). For a typical project with 30 dependencies, that is 330,000 tokens saved in a single project audit. Published on npm as depguard-cli, with a dedicated product page at depguard.dev. The tool is used as an MCP server integrated directly into AI coding workflows.

Technical Achievements

  • Dual advisory sources (npm + GitHub) catch vulnerabilities that single-source tools miss
  • Levenshtein-based typosquatting detection against 100+ popular packages with zero false positives on exact matches
  • Dead dependency detection with awareness of config files, npm scripts, peer deps, workspaces, Vue/Svelte/Astro files, and SCSS imports
  • CVSS score integration for granular security assessment beyond simple severity labels
  • Dual license parsing handles SPDX expressions like MIT OR GPL-3.0 correctly
  • CycloneDX 1.6 SBOM generation with PURLs, SHA-512 integrity hashes, and inline VEX vulnerability data. Output validates against the official schema and is consumed unchanged by OWASP Dependency-Track, Trivy, and Grype
  • Static code analysis: tarball download + 18+ malware pattern scan catches obfuscation and behavioral mismatches before install
  • AI Code Review tool surfaces debris left by AI agents (rogue console.logs, empty catches, broken imports, orphan files)
  • All 298 tests run offline with mock fetch. Zero flaky network tests in CI
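
To illustrate the dual license point above, here is a simplified sketch of evaluating an SPDX expression such as MIT OR GPL-3.0 against a set of acceptable licenses. It assumes only flat expressions (optionally wrapped in one pair of parentheses) and rejects nested grouping and WITH exceptions outright. The function name is hypothetical and this is not depguard's parser.

```typescript
// Hypothetical helper: true if the expression can be satisfied using only allowed licenses.
function isLicenseAcceptable(expression: string, allowed: Set<string>): boolean {
  let expr = expression.trim();
  // package.json often wraps dual licenses in one outer pair: "(MIT OR Apache-2.0)".
  if (expr.startsWith("(") && expr.endsWith(")")) expr = expr.slice(1, -1).trim();
  if (/[()]/.test(expr) || /\sWITH\s/.test(expr)) {
    throw new Error("Nested or WITH expressions are outside this sketch");
  }
  // In SPDX, AND binds tighter than OR: split on OR first, then require every AND term.
  return expr
    .split(/\s+OR\s+/)
    .some((alternative) =>
      alternative.split(/\s+AND\s+/).every((id) => allowed.has(id.trim()))
    );
}
```

An OR expression passes if any one alternative is fully acceptable, so MIT OR GPL-3.0 is acceptable to a project that allows only MIT, while MIT AND GPL-3.0 is not.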

Key Learnings

  1. The best tools are born from real frustration. depguard started as a token-saving hack and became a security platform because the problem kept expanding.
  2. Scoring algorithms must have hard ceilings for critical dimensions. Weighted averages allow bad scores in one dimension to be masked by good scores in others. Security cannot be averaged away.
  3. Static analysis will always have blind spots (dynamic imports, CI-only deps). The right approach is not to pretend you catch everything, but to clearly communicate what you verified and what you did not.
  4. Zero dependencies is not just a feature. It is a trust signal. When your tool audits other packages for supply chain risks, having your own dependency tree undermines credibility.
  5. AI agents hallucinate package names. This is not a theoretical risk. It is a documented supply chain attack vector. Verifying package existence before install is a simple check with outsized impact.
  6. Advisory databases disagree. npm and GitHub report different vulnerabilities for the same package. Deduplication by CVE ID + GHSA ID + URL is essential to avoid double-counting.
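
The deduplication in the last learning could look roughly like the sketch below: two advisories are treated as the same finding when they share any identifier (CVE ID, GHSA ID, or URL). The `Advisory` shape and function name are assumptions for illustration, not the actual response formats of either API.

```typescript
interface Advisory {
  source: "npm" | "github";
  severity: string;
  cveId?: string;
  ghsaId?: string;
  url?: string;
}

// Hypothetical helper: keep the first advisory seen for each shared identifier.
function dedupeAdvisories(advisories: Advisory[]): Advisory[] {
  const seen = new Set<string>();
  const unique: Advisory[] = [];
  for (const adv of advisories) {
    const keys = [adv.cveId, adv.ghsaId, adv.url]
      .filter((k): k is string => Boolean(k))
      .map((k) => k.toLowerCase());
    // An advisory with no identifiers is always kept rather than silently dropped.
    const isDuplicate = keys.some((k) => seen.has(k));
    if (!isDuplicate) unique.push(adv);
    // Record every key, so a later record sharing only one of them still matches.
    keys.forEach((k) => seen.add(k));
  }
  return unique;
}
```

This is a single pass in input order, so which source "wins" for a duplicated finding depends on which list is fed in first.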