Agentailor

Top Agent Skills for Agentic Engineering (and Spec-Driven Development)

Frontier coding agents are capable but non-deterministic. These are the agent skills for agentic engineering and spec-driven development that keep their work durable across sessions.

Ali Ibrahim (@ialijr)

Frontier coding agents are good. The catch is that they're non-deterministic: the architectural decision a model nails today, it may quietly make differently tomorrow. Every session, the agent re-derives the choices embedded in your code (which transport, which boundaries, where state lives) because they were generated, not recorded. For a demo that's fine. For a system you run at scale, that variance is the liability.

Agentic engineering is the answer: you own the architectural decisions and make the project hold onto them, so a capable but forgetful agent stays aligned instead of starting cold each time. The agent handles implementation. You handle architecture, and you write it down where the next session can read it. (I unpacked this discipline in Agent Briefings Issue 16.)

Agent Skills are how you make that discipline portable. They're folders containing a SKILL.md file with instructions, workflows, and references that your agent loads only when relevant. Instead of re-explaining your standards every session, you install a skill once and the agent pulls in the right scaffolding when the task calls for it. For a deep dive on how skills work, see How to Build and Deploy an Agent Skill from Scratch.
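To make "loads only when relevant" a little more concrete, here is a minimal sketch of how a tool might discover installed skills without reading their full instructions. It assumes each SKILL.md opens with a `---`-fenced block of simple `key: value` lines that includes `name` and `description`. That layout, and the names `SkillSummary`, `parse_frontmatter`, and `discover_skills`, are illustrative assumptions here, not the loader any particular agent ships.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass
class SkillSummary:
    """Hypothetical record: just enough to decide whether a skill is relevant."""
    name: str
    description: str
    path: Path


def parse_frontmatter(text: str) -> dict[str, str]:
    """Read simple `key: value` pairs between the leading `---` fences.

    Assumption: flat key/value frontmatter only; nested YAML is not handled.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def discover_skills(root: Path) -> list[SkillSummary]:
    """Collect name and description for every <root>/<skill>/SKILL.md."""
    skills = []
    for skill_file in sorted(root.glob("*/SKILL.md")):
        meta = parse_frontmatter(skill_file.read_text(encoding="utf-8"))
        if "name" in meta and "description" in meta:
            skills.append(SkillSummary(meta["name"], meta["description"], skill_file))
    return skills
```

The design point is that only the short summary is read up front; the body of the skill stays on disk until a task actually matches it.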

The skills below are drawn from the most-installed repos in the ecosystem, each picked as the best option for its slot regardless of who maintains it. They all work across Claude Code, Cursor, VS Code Copilot, Codex, and Gemini CLI. To install any skill, run:

npx skills add <owner/repo> --skill <skill-name>

Treat this as a menu, not a checklist. You don't need all of them. Install one or two, see how they change the way your agent works, and keep the ones that earn their place. Some people will live in spec-driven development and never touch the TDD skill; others will want the full review loop. The bonus at the end, Spec Kit, is a complete workflow on its own that needs none of the others. The point is to play with them and find your own stack.

A note on vibe coding

None of this is for vibe coding. If you're throwing together a prototype, a one-off script, or exploring a problem you don't understand yet, skip the whole list. The upfront cost buys you nothing on work you'll delete tomorrow, and the looser "describe it, look at it, adjust" loop is the right tool there. Explore loosely, harden deliberately. Reach for these skills when the work has to outlast the session and survive production.


1. spec-driven-development

The anchor skill, and a good first one to try. It enforces a gated workflow with a human review step between phases: Specify (write a spec covering objective, commands, structure, code style, testing, and boundaries), Plan (map components, dependencies, order, and risks), Tasks (break the plan into discrete items with acceptance criteria), and Implement (execute the tasks). The spec lives in version control as a document that gets updated as decisions change.
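The gating idea can be sketched in a few lines. This is a toy model, not the skill's implementation: `run_phase` and `approve` are hypothetical callbacks standing in for "the agent produces an artifact" and "a human reviews it".

```python
from enum import Enum
from typing import Callable


class Phase(Enum):
    SPECIFY = "specify"
    PLAN = "plan"
    TASKS = "tasks"
    IMPLEMENT = "implement"


PHASE_ORDER = [Phase.SPECIFY, Phase.PLAN, Phase.TASKS, Phase.IMPLEMENT]


def run_gated_workflow(
    run_phase: Callable[[Phase], str],
    approve: Callable[[Phase, str], bool],
) -> list[Phase]:
    """Run phases in order and stop at the first artifact the reviewer rejects.

    Returns the phases that finished and, for all but the last, passed review.
    """
    completed: list[Phase] = []
    for index, phase in enumerate(PHASE_ORDER):
        artifact = run_phase(phase)
        is_last = index == len(PHASE_ORDER) - 1
        # The gate: a human reviews each artifact before the next phase starts.
        if not is_last and not approve(phase, artifact):
            break
        completed.append(phase)
    return completed
```

If the reviewer rejects the plan, the workflow stops with only the spec approved, and no tasks or code get generated from a plan nobody signed off on.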

Why it matters: "Code without a spec is guessing." The single biggest cause of agent-generated code that misses the mark is an underspecified request. This skill forces the agent to surface assumptions and reframe vague requirements into concrete success criteria before it writes a line. It's how the decisions you'd otherwise re-explain every session move out of the chat and into a document the project keeps.

Best for: Any non-trivial feature or new project where the cost of building the wrong thing is higher than the cost of writing it down first.

npx skills add addyosmani/agent-skills --skill spec-driven-development

GitHub: addyosmani/agent-skills

For why the spec, not the conversation, is the right place to store a decision, see Agent Briefings Issue 17: The Spec Is the Source of Truth.


2. brainstorming

Before the spec comes the design conversation. This skill runs a Socratic refinement process in which the agent walks through context exploration, clarifying questions, and approach proposals. It also enforces a hard gate: no code or scaffolding until a design has been presented and approved. It comes from the obra/superpowers framework (225k+ stars) and is one of the most-installed agentic-engineering skills in the ecosystem.

Why it matters: A capable agent will happily run with the first interpretation of a vague request, and a different interpretation on the next run. Forcing a design phase first turns those assumptions into something you approve on purpose, while they're still cheap to change, instead of discovering them baked into a thousand lines of code. It deliberately applies even to small tasks, because the assumptions that hurt most are the ones nobody thought to question.

Best for: The fuzzy front end of any task, where the right approach is not yet obvious and you want the agent to think with you before it builds.

npx skills add obra/superpowers --skill brainstorming

GitHub: obra/superpowers


3. context-engineering

This skill teaches the agent to feed itself the right information rather than the most information: structuring rules files, pulling in relevant docs, and wiring up MCP integrations so the model works from a curated context instead of guessing or drowning in noise.

Why it matters: Context beats model: who the task is for and what the agent can see matter more than which model you run. The most common failure mode in agent work is not a weak model but a model fed irrelevant or incomplete context. This skill makes context a deliberate input, not an afterthought.

Best for: Developers who keep hitting "the agent should have known that" moments and want to make the agent's working context explicit and repeatable.

npx skills add addyosmani/agent-skills --skill context-engineering

GitHub: addyosmani/agent-skills

Context engineering is a recurring theme on this blog. For a production case study, see Engineering Context for Hybrid AI Personas.


4. test-driven-development

Tests are how you make correctness durable instead of session-dependent. This skill enforces RED-GREEN-REFACTOR: write the failing test first, watch it fail, write the minimal code to pass, then refactor with the tests still green. It treats testing-after-the-fact as grounds to delete the implementation and start over, and it names the rationalizations that creep in (manual verification, sunk-cost shortcuts) so they don't slip through. With 123k+ installs from obra/superpowers, it's the most widely adopted TDD skill in the registry.
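Here is a small sketch of the red and green steps on a made-up example. The `slugify` function and the `check_slugify` test are hypothetical, chosen only to show the order of operations: the test exists first and fails against a stub, then minimal code makes it pass.

```python
def slugify_red(title: str) -> str:
    """RED: the behavior does not exist yet, so the test must fail."""
    raise NotImplementedError


def slugify(title: str) -> str:
    """GREEN: the minimal code that makes the test pass."""
    return "-".join(title.lower().split())


def check_slugify(impl) -> bool:
    """The test, written before either implementation; True when it passes."""
    try:
        return impl("Hello Agent Skills") == "hello-agent-skills"
    except NotImplementedError:
        return False


if __name__ == "__main__":
    print("red:", check_slugify(slugify_red))  # prints "red: False"
    print("green:", check_slugify(slugify))    # prints "green: True"
```

Seeing `False` before `True` is the point: it shows the test can fail, so a later pass actually means something.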

Why it matters: Code that worked when the agent ran it once has demonstrated the happy path on that run, nothing more. The next session, working from slightly different context, will refactor it and may quietly break it. Tests written before the implementation pin the intended behavior down so the regressions a non-deterministic agent introduces on its next change get caught instead of shipped. Watching the test fail first is the proof that the test is checking the right thing. This is the phase that separates a demo from something you'd run in production.

Best for: Any code you intend to keep, extend, or trust in production.

npx skills add obra/superpowers --skill test-driven-development

GitHub: obra/superpowers


5. subagent-driven-development

This is where agentic engineering gets structural. The skill dispatches a fresh subagent per task and runs a two-stage review on the result: first a spec-compliance reviewer confirms the implementation matches the requirements, then a code-quality reviewer checks for problems. It manages status signals (DONE, DONE_WITH_CONCERNS, NEEDS_CONTEXT, BLOCKED) and re-dispatches or escalates accordingly.
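As a rough sketch of that control flow: the four status names come from the skill, but the mapping from status to action below, the `next_action` function, and the action strings are my assumptions for illustration, not the skill's actual logic.

```python
from enum import Enum


class Status(Enum):
    DONE = "DONE"
    DONE_WITH_CONCERNS = "DONE_WITH_CONCERNS"
    NEEDS_CONTEXT = "NEEDS_CONTEXT"
    BLOCKED = "BLOCKED"


def next_action(status: Status, spec_ok: bool = False, quality_ok: bool = False) -> str:
    """Map a subagent's status and review results to a hypothetical next step."""
    if status is Status.BLOCKED:
        return "escalate"
    if status is Status.NEEDS_CONTEXT:
        return "redispatch_with_context"
    # DONE or DONE_WITH_CONCERNS: spec compliance is checked first,
    # and code quality is only considered once the spec review passes.
    if not spec_ok:
        return "redispatch_spec_fix"
    if not quality_ok:
        return "redispatch_quality_fix"
    if status is Status.DONE_WITH_CONCERNS:
        return "accept_and_flag"
    return "accept"
```

For example, `next_action(Status.DONE, spec_ok=True, quality_ok=False)` returns `"redispatch_quality_fix"`, while a task that works but fails the spec review never reaches the quality reviewer at all.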

Why it matters: A single long-running agent accumulates context rot and drifts off-spec. Isolating each task to a fresh subagent keeps every unit of work focused, and the two-stage review encodes the exact thing agentic engineering is about: checking not just whether the code works, but whether it does what the spec said. It's orchestration as system design rather than a single linear chat.

Best for: Larger features broken into many tasks, where you want each one implemented, reviewed against the spec, and quality-checked in isolation.

npx skills add obra/superpowers --skill subagent-driven-development

GitHub: obra/superpowers


6. source-driven-development

This skill grounds the agent's framework and API decisions in official documentation, with citations, rather than whatever it half-remembers from training. When the agent reaches for an API, it checks the source first and cites where the answer came from.

Why it matters: LLM training data goes stale, and the agent space moves fast. An agent confidently calling a deprecated method or inventing a config option that never existed is one of the most expensive failure modes to debug, because the code looks right. Forcing source-grounded, cited decisions turns "the agent thinks this works" into "the docs say this works."

Best for: Anyone building against fast-moving frameworks, SDKs, or APIs where the agent's training cutoff is older than the library version you're using.

npx skills add addyosmani/agent-skills --skill source-driven-development

GitHub: addyosmani/agent-skills


Bonus: github/spec-kit

The six skills above are composable: install the ones you need and the agent loads them per task. This bonus is the opposite philosophy, and for teams committing fully to spec-driven development it may be the most valuable tool on this list.

Spec Kit is GitHub's official, end-to-end Spec-Driven Development toolkit (112k+ stars, MIT licensed). It provides a complete workflow as a set of commands: /speckit.constitution to set governing principles, then /speckit.specify, /speckit.plan, /speckit.tasks, and /speckit.implement, with optional /speckit.clarify, /speckit.analyze, and /speckit.checklist steps. It supports 30+ coding agents and now ships in skills mode, so you can install it as agent skills rather than slash commands.

Why it matters: Where the spec-driven-development skill above is a lightweight discipline you layer onto your existing workflow, Spec Kit is the heavyweight, opinionated framework that makes the spec the executable source of truth for an entire project. If your team wants one shared, governed methodology rather than à la carte skills, this is it.

Best for: Teams standardizing on spec-driven development across a whole project or organization.

uv tool install specify-cli --from git+https://github.com/github/spec-kit.git
specify init <project> --integration-options="--skills"

GitHub: github/spec-kit


Quick Reference

Skill                       | Best For                                      | Created By
spec-driven-development     | Pinning down what to build, first             | Community (addyosmani)
brainstorming               | The design phase before any code              | Community (obra)
context-engineering         | Feeding the agent the right context           | Community (addyosmani)
test-driven-development     | Proving behavior before shipping              | Community (obra)
subagent-driven-development | Per-task isolation with spec + quality review | Community (obra)
source-driven-development   | Grounding decisions in official docs          | Community (addyosmani)
spec-kit (Bonus)            | Full spec-driven workflow, team-wide          | GitHub

Key Takeaways

  • Agentic engineering is making a capable agent's decisions durable. The agent isn't lazy, it's non-deterministic: brilliant today, different tomorrow. These skills move the design, spec, verification, and review out of the conversation and into artifacts the project keeps, so the agent stays aligned across sessions.
  • Start with the spec. Most agent-generated code that misses the mark was underspecified, not under-modeled. spec-driven-development is the highest-leverage skill on this list.
  • The practices are mature; the spec-driven development (SDD) branding is still consolidating. Design, test, and review skills have heavily installed options across multiple independent repos. Spec-driven development by name centers on a couple of canonical sources (addyosmani's agent-skills and GitHub's Spec Kit). That's a signal of where the ecosystem is, not a gap to ignore.
  • Pick from the menu, don't install the whole list. These compose across repos (define with brainstorming and spec-driven-development, ground with context-engineering and source-driven-development, build and verify with test-driven-development and subagent-driven-development), but you don't need all of them. Try a couple, keep what earns its place. Spec Kit alone is enough for some teams.
  • Vet skills before installing. Prefer skills from known maintainers, and for community skills check their detail page on skills.sh: every listed skill displays a Security Audit report so you know what you're installing.

Resources