Testing

devtu-self-evolve - Claude MCP Skill

Orchestrate the full ToolUniverse self-improvement cycle: discover APIs, create tools, test with researcher personas, fix issues, optimize skills, and push via git. References and dispatches to all other devtu skills. Use when asked to: run the self-improvement loop, do a debug/test round, expand tool coverage, improve tool quality, or evolve ToolUniverse.

SEO Guide: Enhance your AI agent with the devtu-self-evolve tool. This Model Context Protocol (MCP) server allows Claude Desktop and other LLMs to orchestrate the full tooluniverse self-improvement cycle: discover apis, create tools, test with res... Download and configure this skill to unlock new capabilities for your AI workflow.

🌟16 stars • 175 forks
📥0 downloads

Documentation

SKILL.md
# ToolUniverse Self-Evolution Orchestrator

Coordinates the full development lifecycle by dispatching to specialized devtu skills.

## The Cycle

```
Discover → Create → Test → Fix → Optimize → Ship → Repeat
```

Each phase maps to a dedicated skill:

| Phase | Skill | What it does |
|-------|-------|-------------|
| **Discover** | `devtu-auto-discover-apis` | Gap analysis, web search for APIs, batch discovery |
| **Create** | `devtu-create-tool` | Build tool class + JSON config + test examples |
| **Test** | *(this skill)* | Launch researcher persona agents to find issues |
| **Fix** | `devtu-fix-tool` | Diagnose failures, implement fixes, validate |
| **Optimize** | `devtu-optimize-skills` | Improve skill reports, evidence handling, UX |
| **Optimize** | `devtu-optimize-descriptions` | Improve tool JSON descriptions for clarity |
| **Docs** | `devtu-docs-quality` | Validate documentation accuracy |
| **Ship** | `devtu-github` | Branch, commit, push, create PR |

## Quick Start

Pick an entry point based on what's needed:

- **"Run a test round"** → jump to [Testing Phase](#testing-phase)
- **"Expand coverage"** → invoke `Skill(skill="devtu-auto-discover-apis")`
- **"Create a new tool"** → invoke `Skill(skill="devtu-create-tool")`
- **"Fix a broken tool"** → invoke `Skill(skill="devtu-fix-tool")`
- **"Improve skills"** → invoke `Skill(skill="devtu-optimize-skills")`
- **"Full cycle"** → follow all phases below in order

---

## Phase 1: Discovery (optional)

Invoke `Skill(skill="devtu-auto-discover-apis")` to:
1. Run gap analysis on current tool categories
2. Search for life science APIs in underrepresented domains
3. Score and prioritize APIs by coverage, reliability, documentation

## Phase 2: Tool Creation (optional)

Invoke `Skill(skill="devtu-create-tool")` for each new API:
1. Create Python tool class implementing the API
2. Create JSON config with parameters, descriptions, test examples
3. Register in `_lazy_registry_static.py` and `default_config.py`
4. Validate: `python -m tooluniverse.cli test <ToolName>`

## Phase 3: Testing Phase

This is the core testing loop, run directly by this skill.

### Setup

1. Check for open PRs: `gh pr list --state open`
2. If unmerged PR → use that branch; if merged → new branch from `origin/main`
3. Rebase: `git fetch origin && git rebase origin/main`

### Researcher Persona Agents

Launch 2 agents per round (A + B) using the Agent tool with these parameters:

**Each agent gets:**
- Domain specialty (oncology, genomics, pharmacology, etc.)
- Research question (specific biological question)
- 5-7 test scenarios exercising different tools
- Instructions to report issues with severity (HIGH/MEDIUM/LOW)
- Issue IDs: `Feature-{round}{letter}-{num}` (e.g., `Feature-59A-001`)

**Agent prompt template** — see [references/persona-template.md](references/persona-template.md)

### Verification (CRITICAL)

Before implementing ANY agent-reported issue, verify via CLI:
```bash
python3 -m tooluniverse.cli run <ToolName> '<json_args>'
```

**50%+ of agent reports are false positives** from MCP interface confusion. Only fix verified issues.

### Fix Principles

1. **Prevent, don't recover** — fix root cause, not symptoms
2. **Validate at input** — reject bad params early with clear guidance
3. **Distinguish "no data" from "bad query"** — different messages for each
4. **Fix the abstraction** — don't add alias lists that grow forever

Anti-patterns: hint text instead of validation, parameter aliases instead of fixing naming, post-hoc probing instead of pre-validation.

## Phase 4: Fix & Commit

1. Implement verified fixes (see [references/bug-patterns.md](references/bug-patterns.md) for code-level patterns)
2. **Run code-simplifier**: `Skill(skill="simplify")` — always after writing or modifying code
3. Lint: `ruff check src/tooluniverse/<file>.py`
4. Verify syntax: `python -c "from tooluniverse.<module> import <Class>"`
5. Test: `python -m tooluniverse.cli run <Tool> '<json>'`
6. Pre-commit hook pattern: stage → commit (fails, reformats) → re-stage → commit
7. Push: `git push origin <branch>`

> Also see `Skill(skill="devtu-code-optimization")` for reusable fix patterns and anti-patterns.

## Phase 5: Optimize (optional)

After fixes are stable:
- `Skill(skill="devtu-optimize-descriptions")` — improve tool descriptions
- `Skill(skill="devtu-optimize-skills")` — improve research skill quality
- `Skill(skill="devtu-docs-quality")` — validate docs accuracy

## Phase 6: Ship

Invoke `Skill(skill="devtu-github")` or manually:
1. Rebase: `git fetch origin && git stash && git rebase origin/main && git stash pop`
2. `git push --force-with-lease origin <branch>`
3. Create or update PR: `gh pr create` / verify with `gh pr view <N> --json mergeable`
4. Verify `"mergeable": "MERGEABLE"` before reporting done

**GitHub repo**: `mims-harvard/ToolUniverse` — always verify with `git remote -v` before pushing.

---

## Git Rules (CRITICAL)

- **NEVER push to main** — all work on feature branches
- **NEVER have multiple open fix PRs** — keep adding to current branch
- **Always rebase before push**: `git fetch origin && git rebase origin/main`
- **Commit message format**: no "BUG" terminology, use "Feature" or "Fix"
- **No AI attribution** in commits

## Common Issue Categories

| Category | Signal |
|----------|--------|
| Silent parameter miss | Wrong-field check; param ignored |
| Always-fires conditional | `.get("field")` on wrong type |
| Silent normalization | Auto-transform not disclosed |
| Wrong notation/case | Gene fusions, Title Case names |
| Substring match | Short symbol returns multiple targets |
| try/except indent | Mismatched → SyntaxError |

Full patterns → [references/bug-patterns.md](references/bug-patterns.md)

Signals

Avg rating0.0
Reviews0
Favorites0

Information

Repository
mims-harvard/ToolUniverse
Author
mims-harvard
Last Sync
3/13/2026
Repo Updated
3/13/2026
Created
3/6/2026

Reviews (0)

No reviews yet. Be the first to review this skill!