Skip to content

Clarify that model routing is not process overhead#464

Open
lorenzstorm1 wants to merge 1 commit into
DietrichGebert:mainfrom
lorenzstorm1:fix/token-routing-not-process-overhead
Open

Clarify that model routing is not process overhead#464
lorenzstorm1 wants to merge 1 commit into
DietrichGebert:mainfrom
lorenzstorm1:fix/token-routing-not-process-overhead

Conversation

@lorenzstorm1

Copy link
Copy Markdown

Problem

When a plan already contains complete implementation code (e.g. 500 lines across two files), ponytail's "skip subagents — that's over-process" heuristic prevents dispatching to a cheaper model. The result: the expensive frontier model re-generates pre-written code verbatim, burning ~5x more tokens for zero additional value.

Real-world conflict

The superpowers plugin includes model-routing configuration that assigns plan tasks a tier (mechanical, standard, frontier) and dispatches subagents to the appropriate model. Tasks with complete code in the plan steps are mechanical → routed to Haiku.

Ponytail intercepts this flow: "The plan already has the code, dispatching subagents to copy-paste it would be over-process — just write it directly." This sounds lazy but isn't: Opus outputting 500 lines of pre-written code costs ~5x what Haiku would.

Fix

New bullet in Rules clarifying that model routing ≠ ceremony. The ladder governs what the solution looks like, not which model tier writes the files. When the code is already designed and written in the plan, dispatching to a mechanical-tier model is the lazier (cheaper) path.

Example

Plan contains two complete scripts (~500 lines total), tier: mechanical. Without this fix:

  • Ponytail: "just write it yourself, subagents are over-process" → Opus outputs 500 lines → expensive

With this fix:

  • Ponytail recognizes dispatch-to-Haiku as the lazy choice → Haiku outputs 500 lines → 5x cheaper, same result

🤖 Generated with Claude Code

When a plan already contains complete code, ponytail's "skip subagents,
over-process" heuristic incorrectly prevents dispatching to cheaper models.
Writing 500 lines of pre-written code through an expensive model is wasteful,
not lazy. The new rule clarifies: the ladder governs the solution, not which
model executes it.

🤖 Generated with Claude Code
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant