Skills-First Architecture
Themis structures every AI agent feature into three layers: skills (domain knowledge), output tools (structured contracts), and workflows (lifecycle orchestration). Skills are the foundation — they carry the judgment, reasoning, and expertise that make agents effective.
The Three Layers
+------------------------------------------------------+
| LAYER 1: SKILL                                       |
| Owns: persona, process, judgment, domain knowledge   |
| Sources: file-based (.claude/skills/, lib/skills/)   |
|          DB-backed (Skill model, 3 scopes)           |
|          prompts (app/prompts/)                      |
+------------------------------------------------------+
                           |
                    agent calls tool
                           |
                           v
+------------------------------------------------------+
| LAYER 2: OUTPUT TOOL                                 |
| Owns: structured data contract, side effects         |
| Lives in: app/services/*_tool_builder.rb             |
| Built with: ClaudeAgentSDK.create_tool()             |
+------------------------------------------------------+
                           |
                     returns result
                           |
                           v
+------------------------------------------------------+
| LAYER 3: WORKFLOW                                    |
| Owns: lifecycle orchestration only                   |
| Lives in: app/services/workflows/                    |
| Target: under 50 lines                               |
+------------------------------------------------------+
Layer 1: Skills
Skills own all judgment. They tell the agent what to do, how to reason, and when to use its tools. Themis supports three skill sources that work together to give agents comprehensive domain knowledge.
File-Based Skills (Codebase)
Skills checked into the repository as markdown files with a SKILL.md manifest. Two directories serve different purposes:
- .claude/skills/: Themis-internal skills. Architecture guides, coding conventions, review methodology, integration helpers. These stay in the Themis repo and are auto-discovered by the Claude Agent SDK.
- lib/skills/: Portable skills. Copied into target project worktrees during code generation so the agent carries cross-project standards (e.g., code-quality/ with Rails conventions and security checklists).
Each skill is a directory containing a SKILL.md (YAML frontmatter + markdown) and optional supplementary files:
.claude/skills/understanding-themis/
  SKILL.md      # Manifest with name, description, content
  HOTWIRE.md    # Supplementary reference (optional)
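To make the manifest layout concrete, here is a minimal sketch of splitting a SKILL.md into its YAML frontmatter and markdown body. The helper name `parse_skill_manifest`, the sample description text, and the parsing rules are assumptions for illustration; the Claude Agent SDK's actual loader is not shown in this document and may behave differently.

```ruby
require "yaml"

# Hypothetical helper: splits a SKILL.md string into frontmatter and body.
# The name/description fields follow the manifest comment above; the parsing
# rules themselves are an assumption, not the SDK's loader.
def parse_skill_manifest(text)
  match = text.match(/\A---\s*\n(.*?)\n---\s*\n?(.*)\z/m)
  raise ArgumentError, "SKILL.md is missing YAML frontmatter" unless match

  meta = YAML.safe_load(match[1]) || {}
  { name: meta["name"], description: meta["description"], body: match[2].strip }
end

# Sample manifest content (illustrative only).
sample = <<~MD
  ---
  name: understanding-themis
  description: Architecture guide for the Themis codebase
  ---
  # Understanding Themis

  Start with the three layers.
MD

manifest = parse_skill_manifest(sample)
puts manifest[:name]
puts manifest[:description]
```

Keeping the name and description in frontmatter lets a loader list available skills without reading each body.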
DB-Backed Skills (Skill Model)
User-created skills stored in the database with Active Storage file attachments. Managed through the web UI or via agent tools during chat. Three scopes control visibility:
| Scope | Owned By | Visible To | Use Case |
|---|---|---|---|
| System | Admin | All users, all spaces | Organization-wide standards |
| Space | Space | All space members | Team-specific knowledge |
| Personal | User | Owner only | Individual preferences and workflows |
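The visibility rules in the table can be sketched as a plain filter. `SketchSkill` and `visible_skills` are hypothetical in-memory stand-ins; the real Skill model and its `available_for` query are not shown here, and whether personal skills are additionally tied to a space is not stated, so this sketch assumes they are owner-only.

```ruby
# Hypothetical in-memory stand-in for the Skill model (illustration only).
SketchSkill = Struct.new(:name, :scope, :space_id, :user_id, keyword_init: true)

# Applies the table's rules: system skills are visible to everyone, space
# skills to members of that space, personal skills to their owner only.
def visible_skills(skills, user_id:, space_id:)
  skills.select do |skill|
    case skill.scope
    when :system   then true
    when :space    then skill.space_id == space_id
    when :personal then skill.user_id == user_id
    else false
    end
  end
end

skills = [
  SketchSkill.new(name: "org-standards", scope: :system),
  SketchSkill.new(name: "team-runbook", scope: :space, space_id: 1),
  SketchSkill.new(name: "other-team-runbook", scope: :space, space_id: 2),
  SketchSkill.new(name: "my-shortcuts", scope: :personal, user_id: 7),
  SketchSkill.new(name: "someone-elses-notes", scope: :personal, user_id: 8)
]

puts visible_skills(skills, user_id: 7, space_id: 1).map(&:name).inspect
```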
DB skills are extracted to disk by SkillExtractor before each agent run, cached per scope with atomic directory replacement. The agent discovers them through the same .claude/skills/ directory convention.
SkillExtractor.prepare_for_agent(space:, user:)
  → Queries Skill.available_for(user, space)
  → Extracts to cache dirs (system / space / personal)
  → Returns add_dirs array for SDK options
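One way to get atomic directory replacement is to stage each extraction in a fresh directory and then flip a symlink to it. The sketch below assumes a POSIX filesystem, where renaming one symlink over another is atomic; `publish_skill_dir` and the directory naming are hypothetical, and SkillExtractor may well use a different mechanism.

```ruby
require "fileutils"
require "securerandom"
require "tmpdir"

# Hypothetical helper (not SkillExtractor itself): writes files into a new
# versioned directory, then atomically repoints cache_root/scope_name at it.
def publish_skill_dir(cache_root, scope_name, files)
  version_dir = File.join(cache_root, "#{scope_name}-#{SecureRandom.hex(4)}")
  FileUtils.mkdir_p(version_dir)
  files.each do |relative_path, content|
    path = File.join(version_dir, relative_path)
    FileUtils.mkdir_p(File.dirname(path))
    File.write(path, content)
  end

  link = File.join(cache_root, scope_name)
  previous = File.symlink?(link) ? File.realpath(link) : nil

  # Create the new symlink under a temporary name, then rename it over the
  # old one. Readers see either the old tree or the new one, never a mix.
  tmp_link = "#{link}.tmp-#{SecureRandom.hex(4)}"
  File.symlink(version_dir, tmp_link)
  File.rename(tmp_link, link)

  FileUtils.rm_rf(previous) if previous
  link
end

Dir.mktmpdir do |root|
  link = publish_skill_dir(root, "system", { "style-guide/SKILL.md" => "version 1" })
  publish_skill_dir(root, "system", { "style-guide/SKILL.md" => "version 2" })
  puts File.read(File.join(link, "style-guide/SKILL.md"))
end
```

The design choice here is that an agent run which starts mid-extraction never sees a half-written skill directory.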
Agents can also create and update their own personal skills during conversation via SkillToolBuilder tools (create_skill, update_skill, list_my_skills).
Agent Prompts
Static and dynamic prompt files that provide workflow-specific instructions:
- Static prompts (.md): Use when the skill needs no runtime context. Example: pr_review.md with review process, quality standards, and verdict criteria.
- Dynamic prompts (.md.erb): ERB templates that inject runtime data. Example: base_agent.md.erb renders per-space context like agent identity and available channels.
Prompts live in app/prompts/. Load via PromptLoader.load("name") (static) or PromptLoader.render("name", locals) (dynamic).
Prompts define process and judgment but do not describe output format — that responsibility belongs to the output tool schema.
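The static/dynamic split can be sketched with Ruby's standard ERB library. `SketchPromptLoader` is a hypothetical stand-in: only the `load` and `render` call shapes come from this document, while the constructor, file lookup, and template contents are assumptions.

```ruby
require "erb"
require "tmpdir"

# Hypothetical stand-in for PromptLoader (internals assumed).
class SketchPromptLoader
  def initialize(dir)
    @dir = dir
  end

  # Static prompt: return the .md file unchanged.
  def load(name)
    File.read(File.join(@dir, "#{name}.md"))
  end

  # Dynamic prompt: evaluate the .md.erb template with the given locals.
  def render(name, locals = {})
    template = File.read(File.join(@dir, "#{name}.md.erb"))
    ERB.new(template, trim_mode: "-").result_with_hash(locals)
  end
end

Dir.mktmpdir do |dir|
  # Illustrative prompt files, not the real contents of app/prompts/.
  File.write(File.join(dir, "pr_review.md"), "Review the diff, then give a verdict.\n")
  File.write(File.join(dir, "base_agent.md.erb"),
             "You are <%= agent_name %>. Channels: <%= channels.join(', ') %>.\n")

  loader = SketchPromptLoader.new(dir)
  puts loader.load("pr_review")
  puts loader.render("base_agent", agent_name: "Themis", channels: ["#eng", "#ops"])
end
```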
How Skills Reach the Agent
ChatJob / ChannelMentionJob
  │
  ├─ System prompt ← PromptLoader (app/prompts/)
  │
  └─ add_dirs ← SkillExtractor
       ├─ File-based skills (.claude/skills/)
       └─ DB skills (extracted to cache)
            ├─ system/
            ├─ {space_id}/space/
            └─ {space_id}/personal/{user_id}/
The Claude Agent SDK scans add_dirs for SKILL.md files and makes them available to the agent automatically. Skills are togglable per space via the feature_skills setting.
Layer 2: Output Tools
Output tools define the structured contract between agent and system. Instead of asking the agent to produce parseable text (fragile), we give it a tool to call with typed arguments.
Wiring tools to callers is the next step after defining them. Each agent caller (full agent, web/API chat, messaging) opts into a set of tool groups via the Tool Catalog — adding a tool to a new caller is one declarative change, not edits across four files.
Tool builders live in app/services/*_tool_builder.rb and use ClaudeAgentSDK.create_tool():
class PRReviewToolBuilder
  def self.build_submit_review_tool(review:, space:)
    ClaudeAgentSDK.create_tool(
      "submit_review",
      "Submit your completed code review.",
      {
        type: "object",
        properties: {
          verdict: { type: "string", enum: %w[APPROVE REQUEST_CHANGES COMMENT] },
          summary: { type: "string", description: "Markdown review summary" },
          comments: {
            type: "array",
            items: {
              type: "object",
              properties: {
                path: { type: "string" },
                line: { type: "integer" },
                body: { type: "string" }
              },
              required: %w[path line body]
            }
          }
        },
        required: %w[verdict summary]
      }
    ) do |args|
      # Side effects: submit to GitHub, update review record
    end
  end
end
The schema is the format specification. The agent sees it in its tool list and knows exactly what to produce. No prompt budget wasted on output format instructions.
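Because the contract is data, a handler or a test can check incoming arguments against it. The sketch below is a hypothetical helper covering only the `required` and `enum` keywords; it assumes arguments arrive as a Hash with symbol keys, and whether the SDK performs its own validation is not stated in this document.

```ruby
# Hypothetical helper: reports violations of the "required" and "enum" parts
# of a JSON-Schema-style hash. Symbol-keyed args are an assumption.
def contract_errors(schema, args)
  errors = []
  schema.fetch(:required, []).each do |key|
    errors << "missing #{key}" unless args.key?(key.to_sym)
  end
  schema.fetch(:properties, {}).each do |key, spec|
    allowed = spec[:enum]
    value = args[key]
    next if allowed.nil? || value.nil? || allowed.include?(value)

    errors << "#{key} must be one of #{allowed.join(', ')}"
  end
  errors
end

# Trimmed copy of the submit_review schema shown above.
REVIEW_SCHEMA = {
  type: "object",
  properties: {
    verdict: { type: "string", enum: %w[APPROVE REQUEST_CHANGES COMMENT] },
    summary: { type: "string" }
  },
  required: %w[verdict summary]
}.freeze

p contract_errors(REVIEW_SCHEMA, { verdict: "APPROVE", summary: "Looks good" })
p contract_errors(REVIEW_SCHEMA, { verdict: "LGTM" })
```

The first call returns an empty list; the second reports the missing summary and the invalid verdict.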
Current Tool Builders
Grouped by purpose. All tools are wired to agent callers via the Tool Catalog.
Workflow output contracts
| Builder | Key Tools | Purpose |
|---|---|---|
| PRReviewToolBuilder | get_pr_info, get_pr_diff, get_pr_comments, get_ci_status, submit_review | PR review workflow output |
| CodeGenerationResultToolBuilder | submit_code_generation_result | PR metadata from code gen |
| AutomationToolBuilder | skip_message | Automation skip decisions |
Triggers (factory-wired)
| Builder | Key Tools | Purpose |
|---|---|---|
| PRReviewTriggerToolBuilder | trigger_pr_review | Enqueue a PR review from chat / mention |
| CodeGenerationToolBuilder | trigger_code_generation | Enqueue code generation from chat / mention |
Data access
| Builder | Key Tools | Purpose |
|---|---|---|
| GithubToolBuilder | get_pr_info, get_pr_diff, get_pr_comments, get_ci_status, list_pull_requests, post_pr_comment | GitHub direct-API access in chat / mention contexts |
| ChatHistoryToolBuilder | search_conversations, recall_conversation | On-demand conversation history |
| RepoSearchToolBuilder | resolve_repo_path | Browse local git worktrees |
| ThemisQueryToolBuilder | query_themis_data | Themis DB queries (editable_by? gate) |
| GoogleDriveProxyToolBuilder | proxied Google Drive read tools | Per-user OAuth-scoped Drive access |
Side effects
| Builder | Key Tools | Purpose |
|---|---|---|
| SentryToolBuilder | update_sentry_issue | Sentry status + assignment |
| MemoryToolBuilder | save_memory, delete_memory | Per-user memory store |
Chat UX
| Builder | Key Tools | Purpose |
|---|---|---|
| AskUserQuestionHook (PreToolUse) | AskUserQuestion (native built-in) | Structured clarifying questions; intercepted, not built as an MCP tool |
| FileToolBuilder | create_file | Agent-generated file downloads |
| ShowWidgetToolBuilder | show_widget | Sandboxed HTML widgets (D3, Mermaid, SVG) |
| ShowChartToolBuilder | show_chart | Structured Chart.js rendering |
| ImageGenerationToolBuilder | generate_image | Gemini image generation |
Resource management
| Builder | Key Tools | Purpose |
|---|---|---|
| SkillToolBuilder | 13 tools: CRUD, file ops, checkout/checkin | Agent-driven skill management |
| AutomationChatToolBuilder | create_automation, update_automation, list_my_automations, delete_automation | Agent-driven automation management |
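The declarative wiring that the Tool Catalog provides might look roughly like the following. This is a sketch only: the constant names, group names, caller names, and which caller gets which group are all hypothetical, since the real Tool Catalog API is not shown in this document. The tool names are taken from the tables above.

```ruby
# Hypothetical catalog (illustration only, not the Tool Catalog's real API).
TOOL_GROUPS = {
  data_access: %w[get_pr_info get_pr_diff search_conversations],
  side_effects: %w[save_memory delete_memory update_sentry_issue],
  chat_ux: %w[create_file show_chart show_widget]
}.freeze

# Each caller opts into groups; the assignments here are invented examples.
CALLER_GROUPS = {
  full_agent: %i[data_access side_effects chat_ux],
  web_chat: %i[data_access chat_ux],
  messaging: %i[data_access]
}.freeze

# Resolves a caller's tool list from its declared groups.
def tools_for(caller_name)
  CALLER_GROUPS.fetch(caller_name).flat_map { |group| TOOL_GROUPS.fetch(group) }
end

puts tools_for(:messaging).inspect
puts tools_for(:web_chat).inspect
```

In a layout like this, giving a caller a new tool group is a one-line change to its entry, which is the property the catalog is described as providing.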
Layer 3: Workflows
The workflow is thin glue. It creates records, starts the agent, handles errors, and updates status. It should contain zero judgment and zero parsing.
All workflows inherit from Workflows::BaseWorkflow and implement #execute. The base class provides #run_agent(prompt:, system_prompt:, model:, max_turns:).
module Workflows
  class FeatureWorkflow < BaseWorkflow
    def execute(input:)
      record = FeatureRecord.create!(input: input, status: "running")
      begin
        result = run_agent(
          prompt: build_prompt(input),
          system_prompt: PromptLoader.load("feature_name")
        )
        record.complete!(result)
      rescue => e
        record.fail!(e.message)
        raise
      end
    end
  end
end
If your workflow is doing regex parsing, JSON extraction, or business logic — something is in the wrong layer.
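The lifecycle can be exercised without Rails or the SDK by swapping in stand-ins. In this sketch `SketchRecord`, `SketchBaseWorkflow`, and the injected `agent:` callable are hypothetical replacements for the ActiveRecord model, `Workflows::BaseWorkflow`, and the real agent call; the prompts are placeholders.

```ruby
# Hypothetical stand-in for the ActiveRecord model.
class SketchRecord
  attr_reader :status, :detail

  def initialize
    @status = "running"
  end

  def complete!(result)
    @status = "completed"
    @detail = result
  end

  def fail!(message)
    @status = "failed"
    @detail = message
  end
end

# Hypothetical base class: run_agent delegates to an injected callable
# instead of calling the SDK.
class SketchBaseWorkflow
  def initialize(agent:)
    @agent = agent
  end

  def run_agent(prompt:, system_prompt:)
    @agent.call(prompt, system_prompt)
  end
end

# Thin glue only: start the agent, record the outcome, re-raise on failure.
class SketchFeatureWorkflow < SketchBaseWorkflow
  def execute(input:, record: SketchRecord.new)
    result = run_agent(prompt: "Analyze: #{input}", system_prompt: "You are a reviewer.")
    record.complete!(result)
    record
  rescue StandardError => e
    record.fail!(e.message)
    raise
  end
end

happy = SketchFeatureWorkflow.new(agent: ->(prompt, _system) { "done: #{prompt}" })
puts happy.execute(input: "ticket 42").status
```

Because the workflow holds no parsing or judgment, a test only has to cover two paths: success marks the record completed, and failure marks it failed and re-raises.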
Decision Framework
| Question | Answer | Layer |
|---|---|---|
| Does it involve judgment, reasoning, or domain knowledge? | Move it to the skill | Skill |
| Does it define a structured data exchange or produce side effects? | Make it an output tool | Output Tool |
| Does it manage record lifecycle, error recovery, or orchestration? | Keep it in the workflow | Workflow |
| Are you parsing LLM free-text into structured data? | You’re doing it wrong | Refactor to Output Tool |
| Are you writing prompt instructions about output format? | The tool schema should handle this | Refactor to Output Tool |
| Is the workflow over 50 lines? | Something is in the wrong layer | Audit and redistribute |
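The last row of the table lends itself to an automated check. The following is a hypothetical lint, not part of Themis: it assumes workflow files end in `_workflow.rb` and that the limit applies to raw line count (blank lines and comments included), neither of which this document specifies.

```ruby
require "tmpdir"

# Hypothetical lint: lists workflow files that are not under the line limit.
def oversized_workflows(dir, limit: 50)
  Dir.glob(File.join(dir, "*_workflow.rb")).sort.filter_map do |path|
    count = File.foreach(path).count
    [File.basename(path), count] if count >= limit
  end
end

# Demo against a throwaway directory; in a Rails app the argument would be
# the workflows directory named above.
Dir.mktmpdir do |dir|
  File.write(File.join(dir, "thin_workflow.rb"), "# thin\n" * 12)
  File.write(File.join(dir, "fat_workflow.rb"), "# fat\n" * 80)
  oversized_workflows(dir).each do |name, count|
    puts "#{name}: #{count} lines, audit and redistribute"
  end
end
```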
Adding a New Feature
Step 1: Write the skill
Decide where the skill lives based on its purpose:
- Agent prompt (app/prompts/): Workflow-specific instructions. Use .md for static, .md.erb for dynamic context.
- Codebase skill (.claude/skills/): Reusable domain knowledge shared across workflows.
- DB skill: User-configurable knowledge managed through the UI.
Focus on: persona, process, judgment criteria, domain knowledge. Do not describe output format.
Step 2: Define the output contract as an SDK tool
Create app/services/feature_tool_builder.rb. The tool schema defines what the agent produces. The handler executes side effects.
class FeatureToolBuilder
  def self.build_submit_tool(record:, space:)
    ClaudeAgentSDK.create_tool(
      "submit_result",
      "Submit your analysis results.",
      { type: "object", properties: { ... }, required: %w[...] }
    ) do |args|
      # Execute side effects, return confirmation
    end
  end
end
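A handler stays easy to test when it does three things in order: validate, perform the side effect, confirm. The sketch below uses hypothetical stand-ins throughout: `SketchTool` only mimics the "name, description, schema, block" shape of the builder above and is not the ClaudeAgentSDK API, the record is a plain Hash, and the return value format is assumed.

```ruby
# Hypothetical stand-in for an SDK tool object (illustration only).
SketchTool = Struct.new(:name, :description, :schema, :handler) do
  def call(args)
    handler.call(args)
  end
end

# Builds a submit tool whose handler validates, records, and confirms.
# The record is a plain Hash here; a real handler would update a model.
def build_sketch_submit_tool(record)
  schema = { type: "object", properties: { summary: { type: "string" } }, required: %w[summary] }

  handler = lambda do |args|
    # 1. Validate
    summary = args[:summary].to_s.strip
    return { ok: false, error: "summary is required" } if summary.empty?

    # 2. Side effect
    record[:summary] = summary
    record[:status] = "submitted"

    # 3. Confirm (return shape is an assumption)
    { ok: true, message: "Result recorded" }
  end

  SketchTool.new("submit_result", "Submit your analysis results.", schema, handler)
end

record = {}
tool = build_sketch_submit_tool(record)
p tool.call({ summary: "Found 2 issues" })
p record
```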
Step 3: Write the workflow as thin glue
Create app/services/workflows/feature_workflow.rb. It should only create records, build prompts, call run_agent, handle errors, and update status. Target under 50 lines.
Step 4: Wire into the job
Create app/jobs/feature_job.rb. The job builds options (model, MCP servers, tools, skill dirs), instantiates the workflow, and calls execute. Use the Tool Catalog to opt into tool groups instead of hand-rolling mcp_servers / allowed_tools lists.
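class FeatureJob < ApplicationJob
  def perform(record_id)
    record = FeatureRecord.find(record_id)
    # build_options assembles model, MCP servers, tools, and skill dirs
    options = build_options(record)
    workflow = Workflows::FeatureWorkflow.new(options: options)
    # The keyword must match the workflow's #execute signature
    # (the Layer 3 example above takes input: rather than record:).
    workflow.execute(record: record)
  end
end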
Anti-Patterns
| Anti-pattern | Why it’s wrong | Fix |
|---|---|---|
| Parsing JSON from agent free-text | Fragile: breaks on markdown fences, extra text, formatting variations | Define an output tool with typed arguments |
| Prompt instructions about output format | Wastes token budget on rules the agent may ignore; duplicates the contract | The tool schema is the format |
| Business logic in the workflow | Couples orchestration to domain logic; makes workflows fat | Move judgment to skills, data contracts to tools |
| Huge prompts with no ERB | Cannot inject runtime context (user preferences, project settings) | Use .md.erb with locals for dynamic sections |
| Multiple output tools per workflow | Confuses the agent about which tool to call | One primary output tool per workflow |
| Tool handler with complex business logic | Hard to test, tightly coupled to infrastructure | Keep handlers thin: validate, side effect, confirm |
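The first row is easy to reproduce. The sketch below uses an invented agent reply to show a free-text response that contains valid JSON yet fails to parse because of the surrounding prose and markdown fence. With an output tool the same data arrives already structured; the symbol-keyed Hash is an assumption about the handler's argument shape.

```ruby
require "json"

# An illustrative free-text reply: valid JSON wrapped in prose and a fence.
fence = "`" * 3
reply = "Here is my review:\n#{fence}json\n{\"verdict\": \"APPROVE\"}\n#{fence}\n"

# Parsing the raw reply fails on the prose and the fence markers.
parsed = begin
  JSON.parse(reply)
rescue JSON::ParserError
  nil
end
puts parsed.nil? ? "free-text parse failed" : "free-text parse succeeded"

# With an output tool, the handler receives arguments, not text to parse.
tool_args = { verdict: "APPROVE" }
puts "tool call verdict: #{tool_args[:verdict]}"
```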
Workflow Maturity
Maturity tracks how cleanly each workflow separates skills, output tools, and orchestration. Tool wiring is now uniform across all workflows via the Tool Catalog regardless of maturity level.
| Workflow | Maturity | Output Contract |
|---|---|---|
| PRReview | High | submit_review SDK tool handles GitHub submission |
| Mention | High | Agent uses MCP tools directly (GitHub, Linear comments) |
| Automation | High | skip_message tool for skip decisions; delivery via AutomationMessageDeliveryService |
| CodeGeneration | Medium | submit_code_generation_result tool exists; PR metadata partially parsed |
Maturity levels:
- High — Prompt owns judgment, tools own contracts, workflow is thin lifecycle glue.
- Medium — Partially follows the pattern. Some parsing or format instructions remain.
- Low — Judgment, parsing, and orchestration tangled in the workflow.