Skills Are the Wrong Abstraction for Agentic Products


I keep hearing the same word in product design discussions, architecture sessions, standups: skills. PMs spec them. Engineers build libraries of them. Teams spend weeks curating them.

Most of the time, the resulting system is harder to operate, harder to secure, and harder to reason about than what they started with.

A document is not a capability

When teams say “skills” they usually mean domain knowledge the agent can draw on, instructions that shape how it behaves, or a callable capability it can invoke. In practice most implementations bundle all of it into a single markdown file — and often add operational commands like CLI calls and bash scripts on top. Three things that need different homes, collapsed into one artefact. The result is too rigid to be good retrieval, too diffuse to be purposeful instructions, and too permissive to be safe.

The security boundary no one names

Many skill files contain operational instructions. Not conceptual guidance. Literal invocations: run this CLI, execute this bash script, install this dependency.

That made sense for developer tooling. Claude Code, Cursor, local agents running inside a sandbox the developer controls. Fine.

Production is a different context. A production agent has access to real infrastructure, real data, real APIs. When it can execute shell commands because a skill file said to, you have not built an intelligent assistant. You have built a privileged process with a natural language interface and no audit trail.

The argument against this is usually “we control which skills get loaded.” That is a brittle guarantee. Skills get updated, copied, and misrouted. The blast radius of a confused agent that can run shell commands is categorically different from one constrained to defined tool contracts.

Production agents should have no ambient capability. Every action is a named tool with typed inputs, typed outputs, and defined failure modes.

If you cannot express a capability as a function signature, it does not belong in a production agent.
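To make that concrete, here is a minimal sketch of a capability expressed as a function signature. The refund tool, its fields, and its error cases are all hypothetical; the point is the shape: typed input, typed output, enumerated failure modes, nothing ambient.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional

# Hypothetical tool contract: every field is typed, every failure mode is named.
class RefundError(Enum):
    ORDER_NOT_FOUND = auto()
    ALREADY_REFUNDED = auto()
    INVALID_AMOUNT = auto()

@dataclass(frozen=True)
class RefundRequest:
    order_id: str
    amount_cents: int

@dataclass(frozen=True)
class RefundResult:
    ok: bool
    error: Optional[RefundError] = None

def issue_refund(req: RefundRequest) -> RefundResult:
    """A named tool: the agent can call this and nothing else."""
    if req.amount_cents <= 0:
        return RefundResult(ok=False, error=RefundError.INVALID_AMOUNT)
    # ... call the payments API behind an access-controlled client here ...
    return RefundResult(ok=True)
```

Because the contract lives in code, it gets versioning, review, and access control for free, and a confused agent can fail only in the ways the contract names.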

The design problem

The skills framing also does something damaging upstream: it gives anyone designing these systems the wrong unit.

When you think in skills, you think in nouns. What skills does this agent have? What skills does it need? You build a skill inventory and treat it like a feature list.

What you are not thinking about is prompt composition, context strategy, or what the agent needs to know at each decision point. The result is over-specified, under-targeted agents loaded with everything they might ever need, behaving inconsistently because nothing clarifies what applies when.

The better question is not “what skills does the agent have” but “what does it need to know at this step, and where does that come from.” That is a context engineering question. It leads to very different architecture.

What you actually need

The useful parts of the skills concept decompose into three things that should be separate:

[Diagram: a skill file decomposing into three separate concerns, tool contracts, retrieval surfaces, and context composition.]

Tool contracts define what an agent can do. Function signatures, typed inputs, typed outputs, defined failure modes. These belong in code with versioning and access control, not in markdown files.

Retrieval surfaces provide context on demand. When an agent needs domain knowledge, it queries for it based on current task state, not because a skill file was preloaded at startup.

Context composition is the actual design problem. What information is in scope at each step? What gets dropped between agents? These are architecture decisions, not content decisions.

What this might look like in practice

In practice, the knowledge side can start as a plain folder structure or repository, retaining the file-based simplicity that made skills appealing in the first place:

context-repo/
├── agents/
│   ├── advisor.md
│   └── researcher.md
└── knowledge/
    ├── use-cases/
    │   ├── cost-reduction.md
    │   ├── market-expansion.md
    │   ├── operational-efficiency.md
    │   └── digital-transformation.md
    ├── methodology/
    │   ├── diagnostic-framework.md
    │   ├── recommendation-patterns.md
    │   ├── stakeholder-mapping.md
    │   └── business-case-structure.md
    ├── industry/
    │   ├── financial-services.md
    │   ├── manufacturing.md
    │   └── retail.md
    └── reference/
        ├── industry-benchmarks.md
        ├── common-objections.md
        └── glossary.md

agents/ holds scoped system prompts: e.g. the advisor facing the user, the researcher pulling benchmarks and comparables.

knowledge/ is a big part of where your product’s value lives. The LLM already knows how the world works in general terms. What it does not know is your domain, your methodology, your customers, your context. That gap is what you are encoding here. Nothing in this repository is executable. Tool contracts are defined in code: typed, versioned, with enforced interfaces. That is where they belong.

To make this context available for progressive discovery, two retrieval tools are enough:

  • list_context(path) returns frontmatter from files under a path, without loading content
  • get_context(path) returns the full content of a specific file

These are not capability tools. They give the agent access to knowledge, not access to systems.
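A minimal sketch of what those two tools could look like, assuming markdown files with YAML frontmatter as in the repository above (the function names match the tools; everything else is an illustrative implementation, not a prescribed one):

```python
from pathlib import Path

def _frontmatter(text: str) -> str:
    """Return the YAML frontmatter block of a markdown file, if present."""
    if text.startswith("---"):
        end = text.find("---", 3)
        if end != -1:
            return text[3:end].strip()
    return ""

def list_context(root: Path, rel: str) -> dict[str, str]:
    """Frontmatter of every .md file under a path: structure without content."""
    base = root / rel
    return {
        str(p.relative_to(root)): _frontmatter(p.read_text())
        for p in sorted(base.rglob("*.md"))
    }

def get_context(root: Path, rel: str) -> str:
    """Full content of one file, fetched only when the task requires it."""
    return (root / rel).read_text()
```

The agent calls list_context to see what exists, then get_context for the one or two files the current step actually needs. Neither call touches any system outside the knowledge repository.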

The agent scans structure first, fetches only what the current task requires. When a user starts a cost-reduction workflow, the agent queries knowledge/use-cases/ first, then pulls methodology only if the task requires it. No preloading. No bloated context window.

Progressive discovery depends on the agent asking the right questions against the right context. If that fails, compose context explicitly at agent initialization: the engineer decides what is in scope, not the agent. Optionally, add folder-level summaries (summary.md) to give the knowledge structure hierarchy, with a dedicated tool that discovers knowledge paths by their summaries.
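Explicit composition can be as simple as an engineer-owned allow-list mapping each workflow to the knowledge paths it may see. The workflow names and paths below are taken from the example repository; the table itself is a hypothetical sketch.

```python
from pathlib import Path

# Engineer-owned composition table: which knowledge paths are in scope
# for each workflow. The agent never selects outside this set.
WORKFLOW_SCOPES: dict[str, list[str]] = {
    "cost-reduction": [
        "knowledge/use-cases/cost-reduction.md",
        "knowledge/methodology/diagnostic-framework.md",
    ],
    "market-expansion": [
        "knowledge/use-cases/market-expansion.md",
        "knowledge/methodology/business-case-structure.md",
    ],
}

def compose_context(root: Path, workflow: str) -> str:
    """Assemble the initial context window from an explicit allow-list."""
    paths = WORKFLOW_SCOPES.get(workflow)
    if paths is None:
        raise KeyError(f"no context scope defined for workflow {workflow!r}")
    return "\n\n".join((root / p).read_text() for p in paths)
```

The design choice is the inversion: scope is a reviewable artefact in code, so changing what an agent knows is a diff, not a prompt tweak.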

This is not a drop-in replacement for a skill library. It is a decomposition of what skill libraries were trying to do into three separate problems: knowledge retrieval, context composition, and tool contracts. Each solved cleanly in its own layer.

As the system grows, the same primitives extend naturally; context composition can be backed by more sophisticated systems (RAG, knowledge graphs, relational databases).

You don’t need skill libraries to build agentic products. In fact they may prevent you from designing the right system.
