# Skills: Teaching AI How to Think
**Core thesis: Skills are lazy-loaded expertise packs that make AI agents both more efficient AND more reliable.**
---
## What Are Skills?
A **skill** is a bundled unit of domain expertise. At its core, it's a `SKILL.md` file — a structured set of instructions, business rules, edge cases, and guardrails — plus optional scripts, templates, and assets.
Skills are **not** loaded into context by default. They sit on a shelf, each described by a short summary. When a task arrives, the agent scans those summaries, identifies which skill (if any) applies, and loads it just-in-time.
Think of it this way: a professional doesn't carry every manual in their backpack. They know which shelf to reach for. Skills give an AI agent that same organisational awareness.
### Anatomy of a Skill
```
weather/
├── SKILL.md          # The SOP: instructions, rules, edge cases
├── scripts/
│   └── fetch.py      # Optional helper scripts
├── templates/
│   └── report.md     # Output templates
└── assets/
    └── icons/        # Supporting files
```
The `SKILL.md` is the brain. Everything else is optional scaffolding.
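
A minimal sketch of how a runtime might discover skills laid out this way. The `discover_skills` helper is hypothetical, and so is the convention that the first non-empty line of `SKILL.md` doubles as the summary; a real runtime may store the summary elsewhere.

```python
from pathlib import Path
from typing import Dict


def discover_skills(root: Path) -> Dict[str, str]:
    """Map each skill name to a one-line summary.

    Assumption: a skill is any direct subdirectory of `root` that contains a
    SKILL.md, and the first non-empty line of that file is its summary.
    """
    skills: Dict[str, str] = {}
    for skill_file in sorted(root.glob("*/SKILL.md")):
        lines = skill_file.read_text(encoding="utf-8").splitlines()
        first = next((ln.strip() for ln in lines if ln.strip()), "")
        # Drop leading Markdown heading markers such as "# ".
        skills[skill_file.parent.name] = first.lstrip("# ").strip()
    return skills
```

Directories without a `SKILL.md` are ignored, which matches the idea that everything besides that file is optional.
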
---
## Efficiency Through Lazy Loading
Without skills, you face an uncomfortable trade-off:
1. **Bloat the system prompt** with every possible instruction — wasting tokens, polluting context, and degrading performance on *all* tasks to marginally improve *some* tasks.
2. **Leave the agent ignorant** — it doesn't know your SOPs, your preferred approaches, your edge cases. It improvises. Sometimes well. Often not.
Skills eliminate this trade-off entirely.
The agent's system prompt contains only **skill descriptions** — a sentence or two each. When a matching task arrives, the full skill loads into context. When the task is done, it's gone. The agent operates with a lean context window most of the time and expert-level depth exactly when needed.
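
The loading behaviour described above could be sketched as follows. All names here (`Skill`, `select_skill`, `build_context`) are hypothetical, and the word-overlap scoring is only a stand-in for the model's own judgement about which skill applies.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Skill:
    name: str
    description: str  # short summary that always sits in the system prompt
    body: str         # full SKILL.md text, loaded only on a match


def base_prompt(skills: List[Skill]) -> str:
    """System prompt section listing descriptions only, never bodies."""
    return "\n".join(f"- {s.name}: {s.description}" for s in skills)


def select_skill(task: str, skills: List[Skill]) -> Optional[Skill]:
    """Pick the skill whose description shares the most words with the task.

    Returns None when no description overlaps with the task at all.
    """
    task_words = set(task.lower().split())
    best, best_score = None, 0
    for skill in skills:
        score = len(task_words & set(skill.description.lower().split()))
        if score > best_score:
            best, best_score = skill, score
    return best


def build_context(task: str, skills: List[Skill]) -> str:
    """Lean context by default; append one skill body only when it matches."""
    parts = [base_prompt(skills)]
    chosen = select_skill(task, skills)
    if chosen is not None:
        parts.append(chosen.body)
    parts.append(f"Task: {task}")
    return "\n\n".join(parts)
```

Because the context is rebuilt per task, a skill body that was loaded for one request does not linger into the next.
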
### The Numbers
Consider an agent with 10 skills, each averaging 2,000 tokens of instructions:
| Approach | Tokens in Context | Quality |
|---|---|---|
| Everything in system prompt | 20,000+ always | Degraded (noise) |
| No skills at all | ~0 | Poor (no expertise) |
| Lazy-loaded skills | ~500 base + 2,000 when needed | Optimal |
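
With these figures, baseline context drops about 40x (20,000 tokens down to roughly 500), and stays 8x smaller even while one skill is loaded (about 2,500 tokens), with *better* outcomes.
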
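
Working through the table's figures makes the ratios explicit. The numbers are the illustrative ones from the table, not measurements.

```python
NUM_SKILLS = 10
TOKENS_PER_SKILL = 2_000
BASE_TOKENS = 500  # skill descriptions only (figure from the table)

everything_loaded = NUM_SKILLS * TOKENS_PER_SKILL  # 20,000 tokens, always present
lazy_idle = BASE_TOKENS                            # 500 tokens, no skill loaded
lazy_active = BASE_TOKENS + TOKENS_PER_SKILL       # 2,500 tokens, one skill loaded

print(everything_loaded / lazy_idle)    # 40.0
print(everything_loaded / lazy_active)  # 8.0
```
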
---
## Reliability Through SOPs
The `SKILL.md` **is** the standard operating procedure. It doesn't just hint at what to do — it encodes:
- **Business rules** — "Always use metric units for Australian users"
- **Edge cases** — "If the API returns a 429, wait 60 seconds before retry"
- **Preferred approaches** — "Use `ripgrep` over `grep` for speed"
- **Guardrails** — "Never delete without confirmation; use trash over rm"
- **Decision trees** — "If X, do Y. If Z, escalate."
This is the difference between giving a junior engineer a runbook and saying "figure it out." Both might get to the same destination, but one gets there reliably, consistently, and without the creative detours that break production.
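
A rule such as the 429 one can also be backed by a helper in the skill's `scripts/` directory, so it is applied the same way every time. The sketch below is an assumption about what such a helper might look like; the function name, the `(status, body)` return shape, and the `max_attempts` limit are all hypothetical.

```python
import time
from typing import Callable, Tuple


def call_with_rate_limit_retry(
    request: Callable[[], Tuple[int, str]],
    wait_seconds: float = 60.0,
    max_attempts: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Call `request`; on HTTP 429, wait and retry, up to `max_attempts` calls in total."""
    for attempt in range(1, max_attempts + 1):
        status, body = request()
        # Return on any non-429 response, or when the retry budget is spent.
        if status != 429 or attempt == max_attempts:
            return status, body
        sleep(wait_seconds)  # the rule from the skill: wait 60 seconds before retrying
    raise ValueError("max_attempts must be at least 1")
```

Passing `sleep` as a parameter keeps the helper testable without actually waiting a minute.
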
### Without a Skill (Improvisation)
> **User:** Check if my server is secure.
>
> **Agent:** *runs a few random checks it remembers from training data, misses half the important ones, suggests changes that conflict with your infrastructure*
### With the Healthcheck Skill (SOP)
> **User:** Check if my server is secure.
>
> **Agent:** *loads healthcheck SKILL.md → follows structured audit: firewall rules → SSH config → update status → service exposure → generates prioritised report with specific remediation steps*
Same request. Wildly different reliability.
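
The structured audit can be pictured as an ordered checklist whose findings are sorted before reporting. This is a sketch only: the `Finding` type, the severity scale, and the stub checks with made-up results are hypothetical and stand in for real inspections.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Finding:
    area: str
    severity: int  # hypothetical scale: 3 = high, 2 = medium, 1 = low
    remediation: str


Check = Callable[[], List[Finding]]


def run_audit(checklist: List[Tuple[str, Check]]) -> List[Finding]:
    """Run every check in checklist order, then sort findings by severity.

    sorted() is stable, so equally severe findings keep checklist order.
    """
    findings: List[Finding] = []
    for _label, check in checklist:
        findings.extend(check())
    return sorted(findings, key=lambda f: f.severity, reverse=True)


# Stub checks with made-up results, standing in for real inspections.
def check_firewall() -> List[Finding]:
    return [Finding("firewall", 2, "Restrict inbound rules to required ports")]


def check_ssh() -> List[Finding]:
    return [Finding("ssh", 3, "Disable password authentication")]


def check_updates() -> List[Finding]:
    return []  # nothing to report in this stub


report = run_audit([
    ("firewall rules", check_firewall),
    ("SSH config", check_ssh),
    ("update status", check_updates),
])
for finding in report:
    print(f"[{finding.severity}] {finding.area}: {finding.remediation}")
```
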
---
## Skills vs. Tools vs. System Prompts
These three layers serve fundamentally different purposes:
| Layer | Question It Answers | Example |
|---|---|---|
| **Tools** | What *can* I do? | "I can read files, search the web, send messages" |
| **System Prompt** | Who *am* I? | "You are helpful, concise, and safety-conscious" |
| **Skills** | *How* do I do specific things well? | "Here's how to audit server security step-by-step" |
All three are needed. Tools without skills are like a workshop full of power tools handed to someone with no training. Skills without tools are expertise with no hands. And the system prompt ties it all together with identity and baseline behaviour.
---
## Real-World Examples
### Weather Skill (Simple)
A lightweight skill that knows how to query weather APIs, format forecasts, handle location lookups, and present results cleanly. Maybe 50 lines of instructions. Loaded when someone asks "what's the weather?"
### SecureTransport Flow Engineering (Complex)
A deep domain skill encoding expertise in Axway SecureTransport — file transfer flows, PGP encryption steps, external script configuration, SFTP ingress patterns, error log locations, and testing harnesses. This is tribal knowledge that took months to accumulate, now available to any agent instantly.
### Healthcheck Skill (Security SOPs)
A structured security audit playbook — firewall configuration, SSH hardening, package update status, service exposure analysis. Follows a defined checklist, produces prioritised findings, and recommends specific remediations aligned with the deployment's risk tolerance.
---
## Expertise Preservation
Here's where skills become strategically important, not just operationally convenient.
**Skills capture tribal knowledge.**
When your best engineer writes a skill, they're encoding their expertise — the shortcuts, the gotchas, the "here's what the documentation doesn't tell you" — into a format that any agent can use, forever.
People leave. People forget. People get busy. But a well-written skill persists. It's organisational knowledge management that actually gets used, because the consumer (the AI agent) rereads the instructions every time the skill loads instead of relying on memory.
This isn't about replacing experts. It's about **scaling** their expertise. One expert writes the skill. Every agent in the organisation benefits.
---
## Composability and Community
Skills are modular by design:
- **Shareable** — Package a skill and hand it to another team or publish it
- **Versionable** — Track changes, roll back, evolve with your processes
- **Stackable** — Multiple skills can be available simultaneously; the agent picks the right one
- **Discoverable** — Skill descriptions form a searchable catalogue
### ClawHub: A Marketplace of Expertise
Skills can be shared through ClawHub — discovered, installed, and composed by anyone running an OpenClaw agent. This creates a flywheel:
1. Someone solves a problem well and writes a skill
2. They share it
3. Others use it, improve it, contribute back
4. The collective expertise grows
It's open-source knowledge, but structured for AI consumption.
---
## Summary
Skills solve two problems at once:
- **Efficiency** — Load expertise on-demand instead of bloating every interaction
- **Reliability** — Follow defined SOPs instead of improvising on critical tasks
They also unlock something bigger: a way to capture, share, and scale human expertise through AI agents. Not replacing the expert — amplifying them.
The question isn't whether your agents need skills. It's what expertise you'd encode first.