Skills.sh: The Missing Package Manager for AI Agent Capabilities

I was setting up Claude Code on a new project last week and caught myself doing the same thing I always do: copying chunks of instructions from one CLAUDE.md to another, tweaking them slightly, and hoping I hadn’t introduced inconsistencies. I had a set of workflows I’d refined over months – how to write commits, how to handle Terraform plans, how to structure PR descriptions – and I was manually transplanting them between repos like it was 2019 and we hadn’t invented package managers yet.

Then I found skills.sh, and the first thing I thought was: this is what I’ve been doing by hand, except someone finally built the registry for it.

What Skills.sh Actually Is

Skills.sh is an open-source project launched by Vercel that gives AI coding agents a standardized way to discover, install, and use reusable capabilities. Think of it as npm for agent behaviors. Instead of packaging JavaScript libraries, it packages workflows, best practices, and domain expertise into modules that any compatible agent can load and execute.

The key insight is that the AI agent ecosystem had a distribution problem. We’ve all built custom instructions, prompt patterns, and workflows that make our agents more effective, but there was no standard way to share them. Skills.sh solves that with a registry, a CLI, and a dead-simple file format.

It works with Claude Code, Cursor, GitHub Copilot, Aider, OpenCode, and over thirty other agents out of the box. One command to install, and the skill is available across your entire project.

The Problem It Solves

To understand why this matters, think about what happened before skills.sh existed. Every team that wanted their AI agent to follow specific patterns had to build that knowledge from scratch. Your commit message conventions, your testing strategy, your deployment workflow – all of it lived in bespoke CLAUDE.md files, .cursorrules, or scattered system prompts that nobody outside your team could benefit from.

This created three problems:

Duplication of effort. Thousands of teams independently writing instructions for common tasks like “write good PR descriptions” or “follow React best practices.” Each team doing the work alone, producing slightly different results, with no way to build on each other’s improvements.

No quality signal. When you write your own agent instructions, you have no idea if they’re effective compared to alternatives. There’s no download count, no community feedback, no iteration from hundreds of users. You’re flying blind.

Knowledge silos. The best agent workflows lived in private repos, never reaching the broader community. A team at Stripe might have perfected how agents should handle API documentation, but that expertise stayed locked inside Stripe’s codebase.

Skills.sh cracks open all three problems at once.

graph TD
    subgraph Before["Before Skills.sh"]
        direction TB
        B1["Team A writes agent instructions"]
        B2["Team B writes same instructions"]
        B3["Team C writes same instructions"]
        B4["No sharing, no standards"]
        B1 --> B4
        B2 --> B4
        B3 --> B4
    end
    subgraph After["After Skills.sh"]
        direction TB
        A1["Author publishes skill to registry"]
        A2["Team A installs skill"]
        A3["Team B installs skill"]
        A4["Team C installs skill"]
        A5["Feedback improves skill for everyone"]
        A1 --> A2
        A1 --> A3
        A1 --> A4
        A2 --> A5
        A3 --> A5
        A4 --> A5
        A5 --> A1
    end
    style Before fill:#4a1a1a,stroke:#8b3a3a,color:#fff
    style After fill:#1a4a1a,stroke:#3a8b3a,color:#fff

How It Works

The entire system is built around a single file format: SKILL.md. If you’ve written a CLAUDE.md or any markdown-based agent instructions, you already know 90% of what you need.

The SKILL.md Format

A skill is a directory containing a SKILL.md file with YAML frontmatter and markdown instructions. Here’s the anatomy:

my-skill/
├── SKILL.md              # Required: the skill definition
└── resources/            # Optional: bundled files
    ├── scripts/          # Executable code the skill can reference
    ├── references/       # Documentation for additional context
    └── assets/           # Templates, configs, etc.

The SKILL.md itself looks like this:

---
name: explain-code
description: Explains code with visual diagrams and analogies.
  Use when explaining how code works, teaching about a codebase,
  or when the user asks "how does this work?"
---

When explaining code, always include:
1. **Start with an analogy**: Compare the code to something from everyday life
2. **Draw a diagram**: Use ASCII art or mermaid to show the flow
3. **Walk through the code**: Explain step-by-step
4. **Highlight a gotcha**: Point out a common mistake or misconception

The description field is the critical piece. It’s the triggering mechanism – it tells the agent when this skill is relevant. When you ask your agent to explain how some code works, it reads the description, matches it to your request, and loads the full instructions.
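
Because matching happens against the description alone, specificity pays off. As a hypothetical contrast (skill name and wording invented for illustration), compare a description the agent can barely match against with one that names its task and trigger phrases:

---
# Too vague – the agent has little to match against:
# description: Helps with databases.

# Specific – names the task and its trigger phrases:
name: review-sql-migration
description: Reviews SQL migrations for destructive operations.
  Use when reviewing migration files, checking ALTER or DROP
  statements, or when the user asks "is this migration safe?"
---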

The Three-Level Loading System

This is the part that impressed me from an engineering perspective. Skills use a progressive loading model that keeps context windows lean:

graph LR
    subgraph Level1["Level 1: Metadata"]
        M1["Name + Description"]
        M2["~100 words"]
        M3["Always in context"]
    end
    subgraph Level2["Level 2: Body"]
        B1["Full SKILL.md instructions"]
        B2["Loaded when triggered"]
        B3["Core workflow logic"]
    end
    subgraph Level3["Level 3: Resources"]
        R1["Scripts, docs, templates"]
        R2["Loaded as needed"]
        R3["Referenced by body"]
    end
    Level1 -->|"Agent matches request"| Level2
    Level2 -->|"Skill needs files"| Level3
    style Level1 fill:#1a365d,stroke:#3182ce,color:#fff
    style Level2 fill:#2a4365,stroke:#3182ce,color:#fff
    style Level3 fill:#3a5275,stroke:#3182ce,color:#fff

Level 1 (Metadata) is always loaded – just the name and description, roughly 100 words. This is what the agent uses to decide whether a skill is relevant to the current task.

Level 2 (Body) loads when the skill is triggered. This is the full set of instructions from the SKILL.md markdown content.

Level 3 (Resources) loads on demand when the skill’s instructions reference bundled scripts, documentation, or templates.

This matters because context windows are finite. You don’t want twenty skills’ worth of instructions eating up tokens when you’re only using one. The progressive loading means you can have dozens of skills installed without paying a context penalty until you actually need them.
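
To make the mechanics concrete, here's a minimal sketch of how an agent-side loader could implement the three levels. This is illustrative TypeScript, not skills.sh internals – every type and function name below is an assumption:

// Illustrative sketch only – not skills.sh source code.
// All types and helper names below are hypothetical.
import { readFile } from "node:fs/promises";

interface Skill {
  name: string;
  description: string;             // Level 1: always in context (~100 words)
  body?: string;                   // Level 2: loaded when the skill triggers
  resources?: Map<string, string>; // Level 3: loaded when the body references files
}

// Level 1: only name + description for every installed skill is ever
// pinned into the agent's context window.
function buildSkillIndex(skills: Skill[]): string {
  return skills.map((s) => `- ${s.name}: ${s.description}`).join("\n");
}

// Level 2: when the agent matches a request to a skill, pull in the
// full SKILL.md body – paid only on use.
async function triggerSkill(skill: Skill, skillDir: string): Promise<string> {
  skill.body ??= await readFile(`${skillDir}/${skill.name}/SKILL.md`, "utf8");
  return skill.body;
}

// Level 3: bundled resources are read lazily, only when the body
// actually points at them.
async function loadResource(skill: Skill, relPath: string, skillDir: string): Promise<string> {
  skill.resources ??= new Map();
  if (!skill.resources.has(relPath)) {
    skill.resources.set(relPath, await readFile(`${skillDir}/${skill.name}/${relPath}`, "utf8"));
  }
  return skill.resources.get(relPath)!;
}

The point of the sketch is the asymmetry: Level 1 is a small fixed cost per installed skill, while Levels 2 and 3 cost nothing until a skill actually fires.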

CLI Commands

Installing a skill takes a single command, and the rest of the CLI is just as terse:

# Install a skill from GitHub
npx skills add vercel-labs/agent-skills

# Search for skills in the registry
npx skills find "terraform"

# Check for updates to installed skills
npx skills check

# Update all installed skills
npx skills update

Installation sources are flexible. You can install from GitHub shorthand, full URLs, GitLab, any git URL, or even local paths:

# GitHub shorthand
npx skills add vercel-labs/agent-skills

# Full GitHub URL
npx skills add https://github.com/vercel-labs/agent-skills

# Specific skill within a repo
npx skills add vercel-labs/agent-skills/react-performance

# Local path for development
npx skills add ./my-local-skill

Skills vs MCP: Complementary, Not Competitive

This is the question everyone asks first, and the distinction matters. MCP (Model Context Protocol) and skills.sh solve different problems at different layers of the stack.

MCP solved “how do agents talk to tools.” It provides standardized interfaces for AI agents to access external services – databases, APIs, file systems, cloud providers. MCP is a transport and interface layer. It gets your agent connected to the data.

Skills.sh solves “how do developers share agent capabilities.” It packages workflows, best practices, and domain expertise into reusable modules. A skill tells the agent what to do and how to think about a problem.

They work together naturally. A skill can reference MCP servers, incorporate system prompts, and orchestrate complex workflows that involve multiple tools. Here’s the mental model:

graph TB
    subgraph Agent["AI Agent"]
        S["Skill: Deploy to AWS"]
    end
    subgraph Skills["Skills Layer — What to Do"]
        S1["1. Run tests first"]
        S2["2. Build container image"]
        S3["3. Push to ECR"]
        S4["4. Update ECS service"]
        S5["5. Verify health check"]
    end
    subgraph MCP["MCP Layer — How to Connect"]
        M1["AWS MCP Server"]
        M2["Docker MCP Server"]
        M3["GitHub MCP Server"]
    end
    subgraph Infra["Infrastructure"]
        I1["ECR Registry"]
        I2["ECS Cluster"]
        I3["GitHub Actions"]
    end
    Agent --> Skills
    S1 --> M3
    S2 --> M2
    S3 --> M1
    S4 --> M1
    S5 --> M1
    M1 --> I1
    M1 --> I2
    M3 --> I3
    style Skills fill:#2a4365,stroke:#3182ce,color:#fff
    style MCP fill:#1a4a1a,stroke:#3a8b3a,color:#fff
    style Agent fill:#4a3a1a,stroke:#8b7a3a,color:#fff

MCP gets the agent to the data. A skill tells the agent what to do with it once it’s there. A deployment skill might orchestrate five steps that each use MCP servers to interact with different services. The skill encodes the workflow logic – the ordering, the error handling, the best practices – while MCP handles the actual connections.
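
As a concrete illustration of that split, here's what the deployment skill from the diagram might look like as a SKILL.md. This is a hypothetical sketch – the skill name, steps, and MCP server pairings are invented for illustration; the connections themselves come from whatever MCP servers the agent has configured:

---
name: deploy-to-aws
description: Deploys the current branch to the AWS ECS environment.
  Use when the user asks to deploy, ship, or release to AWS.
---

When deploying:
1. Run the test suite first; abort on any failure
2. Build the container image (via the Docker MCP server)
3. Push the image to ECR (via the AWS MCP server)
4. Update the ECS service and wait for it to stabilize (AWS MCP server)
5. Hit the health check endpoint and confirm success before reporting done

Every line of workflow logic lives in the skill; not one line of connection plumbing does.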

Invocation Controls

One detail that matters for production use: skills have explicit controls over who can trigger them and how.

---
name: deploy-production
description: Deploy the current branch to production environment
disable-model-invocation: true
---

Setting disable-model-invocation: true means only the user can invoke this skill. The agent won’t trigger it autonomously, no matter how relevant it seems. This is critical for skills with side effects – deployments, database migrations, anything destructive. You want a human in the loop for those.

The inverse is also available:

---
name: code-style-guide
description: Internal coding style conventions and patterns
user-invocable: false
---

Setting user-invocable: false makes the skill background knowledge that the agent applies automatically but the user never triggers directly. Style guides, architecture conventions, and coding standards fit this pattern perfectly. You want the agent to always follow them without requiring you to explicitly invoke a skill every time.

The Ecosystem Is Moving Fast

The adoption numbers tell the story. Within six hours of Vercel announcing skills.sh, the top skill had over 20,000 installs. Stripe shipped their own skills the same day. As of now, the trending leaderboard looks like this:

Skill                        Installs
vercel-labs/skills            12,200+
1nference-sh/skills            5,000+
vercel-labs/agent-skills       4,100+
anthropics/skills              3,200+
remotion-dev/skills            2,500+
supabase/agent-skills            826+

The vercel-labs/agent-skills package is particularly interesting – it encodes ten years of Vercel’s engineering experience with React and Next.js performance optimization into skills that any agent can use. That’s the kind of institutional knowledge that used to live in internal wikis and senior engineers’ heads.

Writing Your Own Skills

If you’ve been building CLAUDE.md files or custom agent instructions, you already have skills waiting to be extracted. Here’s a practical example of turning a common workflow into a skill.

Say you have a Terraform workflow that you always follow. You could package it as a skill:

---
name: terraform-plan-review
description: Reviews Terraform plan output for safety and best practices.
  Use when reviewing terraform plan output, checking for destructive changes,
  or when the user asks to review infrastructure changes.
---

When reviewing a Terraform plan:

1. **Check for destructive actions first**
   - Flag any `destroy` or `replace` actions immediately
   - Highlight resources that will be recreated due to force-new attributes
   - Call out any changes to stateful resources (databases, storage)

2. **Validate the change scope**
   - Confirm the number of changes matches expectations
   - Flag unexpected resources appearing in the plan
   - Check for cascading changes from module updates

3. **Review security implications**
   - Check security group changes for overly permissive rules
   - Validate IAM policy changes aren't granting excessive permissions
   - Flag any resources being created without encryption

4. **Summarize for the reviewer**
   - Provide a concise summary: X to add, Y to change, Z to destroy
   - List the most impactful changes first
   - Recommend whether to proceed, investigate, or abort

Once you’ve written it, publishing is straightforward: push it to a GitHub repo and anyone can install it with npx skills add your-username/your-skill-repo.
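
In practice that's nothing more than ordinary git – a minimal flow, with placeholder repo and username:

# Create the skill repo (names are placeholders)
mkdir terraform-plan-review && cd terraform-plan-review
# ...write the SKILL.md shown above...
git init && git add SKILL.md
git commit -m "Add terraform-plan-review skill"
git remote add origin git@github.com:your-username/terraform-plan-review.git
git push -u origin main

# Anyone can now install it
npx skills add your-username/terraform-plan-review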

Where This Is Heading

What excites me most about skills.sh isn’t the current state – it’s the trajectory. We’re watching the emergence of an ecosystem that could do for AI agent behaviors what npm did for JavaScript libraries and Docker Hub did for container images.

The implications are significant:

Team standardization. Instead of hoping everyone configures their AI agent the same way, you install a shared skill set and the entire team’s agents behave consistently. Onboarding a new developer means running npx skills add your-org/engineering-skills instead of walking them through a wiki.

Vendor expertise distribution. Cloud providers, SaaS companies, and framework authors can ship skills alongside their products. The Supabase team shipping agent skills means your AI assistant understands Supabase idioms natively, without you having to teach it.

Community-driven quality. The best patterns will surface through install counts and community feedback. Instead of every team independently figuring out the best way to write PR descriptions or handle database migrations, the community converges on proven approaches.

Key Learnings

  • Skills.sh treats agent capabilities like packages – discoverable, installable, versioned, and shareable, solving the distribution problem that has plagued AI agent customization
  • The SKILL.md format is intentionally simple – YAML frontmatter plus markdown instructions means anyone who has written a README can write a skill; the low barrier to entry is a feature, not a limitation
  • Progressive loading respects context windows – the three-level system (metadata always loaded, body on trigger, resources on demand) means you can install many skills without wasting tokens
  • Skills and MCP are complementary layers – MCP handles connectivity to external tools and services, while skills encode the workflows and best practices for using those connections effectively
  • Invocation controls matter for production safety – disable-model-invocation prevents agents from autonomously triggering destructive operations, keeping humans in the loop where it counts
  • The ecosystem effect is the real value – individual skills are useful, but the registry and discovery mechanism create a flywheel where shared expertise improves everyone’s AI-assisted development
  • If you’ve written CLAUDE.md files, you already know how to write skills – the format is familiar enough that extracting your existing agent workflows into shareable skills is a low-effort, high-impact move

Skills.sh is still early, but the pattern it establishes feels right. We’ve spent the last couple of years figuring out how to make AI agents useful for individual developers. The next phase is making that usefulness portable, shareable, and composable. Skills.sh is the infrastructure for that phase, and the adoption speed suggests the community was ready for it.