cscott/ai-presentations

Clawdbot 27b8eb3e05 Add 03-skills presentation: Skills for AI agents

2026-02-25 22:15:21 +00:00

7.1 KiB

Skills: Teaching AI How to Think

Core thesis: Skills are lazy-loaded expertise packs that make AI agents both more efficient AND more reliable.

What Are Skills?

A skill is a bundled unit of domain expertise. At its core, it's a SKILL.md file — a structured set of instructions, business rules, edge cases, and guardrails — plus optional scripts, templates, and assets.

Skills are not kept in context by default. They sit on a shelf, each described by a short summary. When a task arrives, the agent scans those summaries, identifies which skill (if any) applies, and loads it just-in-time.

Think of it this way: a professional doesn't carry every manual in their backpack. They know which shelf to reach for. Skills give an AI agent that same organisational awareness.
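The scan-then-load step described above can be sketched roughly as follows. This is a minimal illustration, not how any particular agent runtime does it: `SkillEntry`, `pick_skill`, and `load_skill` are hypothetical names, and simple keyword overlap stands in for the model's own judgement about which summary matches.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass
class SkillEntry:
    """What the agent sees up front: a name and a short summary only."""
    name: str
    summary: str
    path: Path  # location of the SKILL.md, read only on demand


def pick_skill(task: str, shelf: list[SkillEntry]) -> Optional[SkillEntry]:
    """Return the entry whose summary shares the most words with the task.

    Keyword overlap is an assumption made for this sketch; a real agent
    would let the model choose. Returns None when nothing overlaps.
    """
    task_words = set(task.lower().split())
    best, best_score = None, 0
    for entry in shelf:
        score = len(task_words & set(entry.summary.lower().split()))
        if score > best_score:
            best, best_score = entry, score
    return best


def load_skill(entry: SkillEntry) -> str:
    """Read the full instructions just in time."""
    return entry.path.read_text(encoding="utf-8")
```

Only the summaries are touched during selection; the SKILL.md file is opened after a match is found, which is what keeps the shelf cheap to carry.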

Anatomy of a Skill

weather/
├── SKILL.md          # The SOP — instructions, rules, edge cases
├── scripts/
│   └── fetch.py      # Optional helper scripts
├── templates/
│   └── report.md     # Output templates
└── assets/
    └── icons/        # Supporting files

The SKILL.md is the brain. Everything else is optional scaffolding.
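As a small sketch of that layout rule, the function below treats a directory as a skill only if it contains SKILL.md and then reports which optional parts are present. The function name `inspect_skill` and the shape of the returned dictionary are assumptions for illustration.

```python
from pathlib import Path

# Optional directories from the layout shown above.
OPTIONAL_PARTS = ("scripts", "templates", "assets")


def inspect_skill(skill_dir: Path) -> dict:
    """Report whether a directory looks like a skill and which optional parts it has."""
    return {
        "name": skill_dir.name,
        # SKILL.md is the only required file in this sketch.
        "valid": (skill_dir / "SKILL.md").is_file(),
        "optional_parts": [p for p in OPTIONAL_PARTS if (skill_dir / p).is_dir()],
    }
```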

Efficiency Through Lazy Loading

Without skills, you face an uncomfortable trade-off:

Bloat the system prompt with every possible instruction — wasting tokens, polluting context, and degrading performance on all tasks to marginally improve some tasks.
Leave the agent ignorant — it doesn't know your SOPs, your preferred approaches, your edge cases. It improvises. Sometimes well. Often not.

Skills largely remove this trade-off.

The agent's system prompt contains only skill descriptions — a sentence or two each. When a matching task arrives, the full skill loads into context. When the task is done, it's gone. The agent operates with a lean context window most of the time and expert-level depth exactly when needed.
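The load-then-discard lifecycle can be modelled with a context manager. This is a toy sketch under stated assumptions: `Context` and `skill_loaded` are hypothetical names, and the context window is represented as a plain list of text segments rather than anything an actual runtime exposes.

```python
from contextlib import contextmanager
from dataclasses import dataclass, field


@dataclass
class Context:
    """A toy context window: base prompt, skill descriptions, loaded skills."""
    base_prompt: str
    skill_descriptions: dict[str, str]  # name -> one-line description
    loaded: list[str] = field(default_factory=list)

    def render(self) -> str:
        catalogue = "\n".join(
            f"- {name}: {desc}" for name, desc in self.skill_descriptions.items()
        )
        return "\n\n".join(
            [self.base_prompt, "Available skills:\n" + catalogue, *self.loaded]
        )


@contextmanager
def skill_loaded(ctx: Context, skill_body: str):
    """Add the full skill text for the duration of one task, then drop it."""
    ctx.loaded.append(skill_body)
    try:
        yield ctx
    finally:
        # Runs even if the task raises, so the context returns to its lean state.
        ctx.loaded.remove(skill_body)
```

Inside a `with skill_loaded(ctx, body):` block the rendered context includes the full skill; outside it, only the one-line descriptions remain.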

The Numbers

Consider an agent with 10 skills, each averaging 2,000 tokens of instructions:

Approach	Tokens in Context	Quality
Everything in system prompt	20,000+ always	Degraded (noise)
No skills at all	~0	Poor (no expertise)
Lazy-loaded skills	~500 base + 2,000 when needed	Optimal

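With these figures, idle context drops from 20,000+ tokens to about 500 (roughly 40x smaller), and stays about 8x smaller (around 2,500 tokens) while one skill is loaded, with better outcomes.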
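Working through the table's figures makes the size of the saving concrete. The arithmetic below uses only the numbers given above (10 skills, 2,000 tokens each, about 500 tokens of descriptions); the exact multiple depends on whether a skill is currently loaded.

```python
NUM_SKILLS = 10
TOKENS_PER_SKILL = 2_000
BASE_TOKENS = 500  # skill descriptions only, per the table's "~500 base"

everything_upfront = NUM_SKILLS * TOKENS_PER_SKILL   # 20,000 tokens, always present
lazy_idle = BASE_TOKENS                              # 500 tokens with no skill loaded
lazy_one_skill = BASE_TOKENS + TOKENS_PER_SKILL      # 2,500 tokens with one skill loaded

idle_ratio = everything_upfront / lazy_idle          # 40.0
active_ratio = everything_upfront / lazy_one_skill   # 8.0
```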

Reliability Through SOPs

The SKILL.md is the standard operating procedure. It doesn't just hint at what to do — it encodes:

Business rules — "Always use metric units for Australian users"
Edge cases — "If the API returns a 429, wait 60 seconds before retry"
Preferred approaches — "Use ripgrep over grep for speed"
Guardrails — "Never delete without confirmation; use trash over rm"
Decision trees — "If X, do Y. If Z, escalate."

This is the difference between handing a junior engineer a runbook and telling them to "figure it out." Both might reach the same destination, but the one with the runbook gets there reliably, consistently, and without the creative detours that break production.
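To show how precise such a rule is, here is the 429 edge case from the list above written out as code. It is a sketch, not part of any skill: `fetch_with_retry` is a hypothetical helper, the `fetch` callable is assumed to return a `(status, body)` pair, and the attempt limit is an assumption because the example rule does not state one.

```python
import time
from typing import Callable

RETRY_WAIT_SECONDS = 60  # from the example rule: wait 60 seconds before retry
MAX_ATTEMPTS = 3         # assumption: the rule does not state a limit


def fetch_with_retry(
    fetch: Callable[[], tuple[int, str]],
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call fetch(); on HTTP 429 wait and try again, up to MAX_ATTEMPTS.

    Any non-429 status is returned as-is in this sketch.
    """
    for attempt in range(1, MAX_ATTEMPTS + 1):
        status, body = fetch()
        if status != 429:
            return body
        if attempt < MAX_ATTEMPTS:
            sleep(RETRY_WAIT_SECONDS)
    raise RuntimeError("still rate-limited after retries; escalate")
```

Passing `sleep` as a parameter keeps the sketch testable without waiting a real minute.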

Without a Skill (Improvisation)

User: Check if my server is secure.

Agent: runs a few random checks it remembers from training data, misses half the important ones, suggests changes that conflict with your infrastructure

With the Healthcheck Skill (SOP)

User: Check if my server is secure.

Agent: loads healthcheck SKILL.md → follows structured audit: firewall rules → SSH config → update status → service exposure → generates prioritised report with specific remediation steps

Same request. Wildly different reliability.
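The structured audit can be pictured as an ordered checklist whose findings are sorted by severity. The sketch below is illustrative only: the check functions, the `host` dictionary keys, and the severity scale are all hypothetical, and a real healthcheck skill would describe these steps in prose for the agent rather than ship them as code.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Finding:
    check: str
    severity: int  # 1 = most urgent (hypothetical scale)
    remediation: str


def check_firewall(host: dict) -> Optional[Finding]:
    if not host.get("firewall_enabled", False):
        return Finding("firewall rules", 1, "Enable the firewall; default-deny inbound.")
    return None


def check_ssh(host: dict) -> Optional[Finding]:
    if host.get("ssh_password_auth", True):
        return Finding("SSH config", 2, "Disable password authentication; use keys.")
    return None


def check_updates(host: dict) -> Optional[Finding]:
    if host.get("pending_updates", 0) > 0:
        return Finding("update status", 3, "Apply pending package updates.")
    return None


def check_exposure(host: dict) -> Optional[Finding]:
    if host.get("exposed_services"):
        return Finding("service exposure", 2, "Close or restrict exposed services.")
    return None


# Order follows the audit sequence described above.
CHECKLIST: list[Callable[[dict], Optional[Finding]]] = [
    check_firewall, check_ssh, check_updates, check_exposure,
]


def run_audit(host: dict) -> list[Finding]:
    """Run every check in order, then return findings with the most urgent first."""
    findings = []
    for check in CHECKLIST:
        finding = check(host)
        if finding is not None:
            findings.append(finding)
    # sorted() is stable, so equal severities keep their checklist order.
    return sorted(findings, key=lambda f: f.severity)
```

Because every check runs on every audit, nothing is skipped on a bad day, which is the reliability gain the comparison above is pointing at.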

Skills vs. Tools vs. System Prompts

These three layers serve fundamentally different purposes:

Layer	Question It Answers	Example
Tools	What can I do?	"I can read files, search the web, send messages"
System Prompt	Who am I?	"You are helpful, concise, and safety-conscious"
Skills	How do I do specific things well?	"Here's how to audit server security step-by-step"

All three are needed. Tools without skills are like a workshop full of power tools handed to someone with no training. Skills without tools are expertise with no hands. And the system prompt ties it all together with identity and baseline behaviour.
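One way to see how the layers depend on each other is to model them as separate fields and check which skills lack the tools they rely on. This is a sketch under assumptions: the `Skill` and `Agent` classes and the `requires` field are hypothetical and not taken from any real agent framework.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Skill:
    name: str
    description: str
    requires: set[str] = field(default_factory=set)  # tool names the SOP relies on


@dataclass
class Agent:
    system_prompt: str                       # who am I?
    tools: dict[str, Callable[..., object]]  # what can I do?
    skills: list[Skill]                      # how do I do specific things well?

    def unusable_skills(self) -> dict[str, set[str]]:
        """Map each skill to the tools it needs but the agent lacks."""
        result = {}
        for skill in self.skills:
            missing = skill.requires - set(self.tools)
            if missing:
                result[skill.name] = missing
        return result
```

A skill that shows up in `unusable_skills()` is the "expertise with no hands" case: the instructions exist, but the agent cannot carry them out.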

Real-World Examples

Weather Skill (Simple)

A lightweight skill that knows how to query weather APIs, format forecasts, handle location lookups, and present results cleanly. Maybe 50 lines of instructions. Loaded when someone asks "what's the weather?"

SecureTransport Flow Engineering (Complex)

A deep domain skill encoding expertise in Axway SecureTransport — file transfer flows, PGP encryption steps, external script configuration, SFTP ingress patterns, error log locations, and testing harnesses. This is tribal knowledge that took months to accumulate, now available to any agent instantly.

Healthcheck Skill (Security SOPs)

A structured security audit playbook — firewall configuration, SSH hardening, package update status, service exposure analysis. Follows a defined checklist, produces prioritised findings, and recommends specific remediations aligned with the deployment's risk tolerance.

Expertise Preservation

Here's where skills become strategically important, not just operationally convenient.

Skills capture tribal knowledge.

When your best engineer writes a skill, they're encoding their expertise — the shortcuts, the gotchas, the "here's what the documentation doesn't tell you" — into a format that any agent can use, forever.

People leave. People forget. People get busy. But a well-written skill persists. It's organisational knowledge management that actually works, because the consumer (the AI agent) is built to read and follow the instructions in full every time the skill loads.

This isn't about replacing experts. It's about scaling their expertise. One expert writes the skill. Every agent in the organisation benefits.

Composability and Community

Skills are modular by design:

Shareable — Package a skill and hand it to another team or publish it
Versionable — Track changes, roll back, evolve with your processes
Stackable — Multiple skills can be available simultaneously; the agent picks the right one
Discoverable — Skill descriptions form a searchable catalogue

ClawHub: A Marketplace of Expertise

Skills can be shared through ClawHub — discovered, installed, and composed by anyone running an OpenClaw agent. This creates a flywheel:

Someone solves a problem well and writes a skill
They share it
Others use it, improve it, contribute back
The collective expertise grows

It's open-source knowledge, but structured for AI consumption.

Summary

Skills solve two problems at once:

Efficiency — Load expertise on-demand instead of bloating every interaction
Reliability — Follow defined SOPs instead of improvising on critical tasks

They also unlock something bigger: a way to capture, share, and scale human expertise through AI agents. Not replacing the expert — amplifying them.

The question isn't whether your agents need skills. It's what expertise you'd encode first.
