personal lab / renos.tk archive

artifact / active

opencode-model-fallback

Model fallback plugin for OpenCode

TypeScript, compiled by Bun to a single ESM bundle. A per-agent state machine detects rate-limit and billing errors using natural-language heuristics, then applies ordered fallback chains, 2x exponential backoff, and automatic primary-model restoration.

created: 16/05/2026
updated: 17/05/2026

> notes

opencode-model-fallback exists for a very specific OpenCode workflow: use the quota that comes with a subscription first, then fall back to an API pay-as-you-go model only when the preferred model is rate-limited or usage-limited.

You can solve that with a local proxy, but maintaining a proxy server often feels too heavy when all you need is a simple one-to-one fallback inside OpenCode.

So I built the routing directly as an OpenCode plugin.

What it does

The plugin maps one preferred model to one fallback model.

When the preferred model hits a retryable provider failure, the plugin aborts the in-flight request, replays the latest user message on the mapped fallback model, and puts the failed model into a global cooldown.

When the cooldown expires, requests route back to the original model.
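
The routing described above can be sketched roughly as follows. This is a minimal illustration, not the plugin's actual code: the model IDs, the in-memory maps, and the names `markFailed` and `resolveModel` are all assumptions made for the example, and aborting and replaying the request is left out.

```typescript
// Hypothetical model IDs and function names; none of these come from the plugin's source.
type ModelId = string;

// One preferred model maps to exactly one fallback model.
const fallbackFor = new Map<ModelId, ModelId>([
  ["subscription/preferred-model", "api/fallback-model"],
]);

// Failed model -> epoch milliseconds at which its cooldown ends.
const cooldownUntil = new Map<ModelId, number>();

// Record a retryable failure and start the cooldown for that model.
function markFailed(model: ModelId, cooldownMs: number, now: number = Date.now()): void {
  cooldownUntil.set(model, now + cooldownMs);
}

// Pick the model for the next request.
function resolveModel(requested: ModelId, now: number = Date.now()): ModelId {
  const until = cooldownUntil.get(requested);
  if (until === undefined) return requested;

  if (now >= until) {
    // Cooldown expired: route back to the original model.
    cooldownUntil.delete(requested);
    return requested;
  }

  // Still cooling down: use the single mapped fallback, if one is configured.
  return fallbackFor.get(requested) ?? requested;
}
```

Passing `now` explicitly keeps the sketch easy to test. A model with no configured fallback is simply returned unchanged, which is one possible choice rather than confirmed plugin behavior.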

Why it works this way

The goal is not load balancing, racing providers, or building a full proxy layer. The goal is much narrower: keep working when the model I wanted to use is temporarily blocked by quota, rate limits, or usage limits.

That makes the behavior predictable:

  • Prefer the original model when it is available.
  • Switch to the mapped fallback after a retryable failure.
  • Persist the cooldown so other sessions do not immediately hit the same failed model.
  • Return to the original model after the cooldown expires.
  • Stop at one fallback instead of hiding more complex routing behind the scenes.
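
One way to make a cooldown visible to other sessions is a small shared state file. The sketch below assumes a JSON file in the system temp directory that maps model IDs to cooldown expiry timestamps. The file name, the file format, and the helper names are hypothetical, and the plugin may persist its state differently.

```typescript
import { existsSync, readFileSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical location and shape; the plugin's real state file may differ.
const STATE_FILE = join(tmpdir(), "model-fallback-cooldowns.json");

// Model ID -> epoch milliseconds at which the cooldown ends.
type Cooldowns = Record<string, number>;

// Read the shared state, treating a missing or corrupt file as "no cooldowns".
function loadCooldowns(file: string = STATE_FILE): Cooldowns {
  if (!existsSync(file)) return {};
  try {
    return JSON.parse(readFileSync(file, "utf8")) as Cooldowns;
  } catch {
    return {};
  }
}

// Record a cooldown so that other sessions reading the same file see it.
function saveCooldown(model: string, until: number, file: string = STATE_FILE): void {
  const state = loadCooldowns(file);
  state[model] = until;
  writeFileSync(file, JSON.stringify(state));
}

// True while the model's persisted cooldown has not yet expired.
function isCoolingDown(model: string, now: number = Date.now(), file: string = STATE_FILE): boolean {
  const until: number | undefined = loadCooldowns(file)[model];
  return until !== undefined && now < until;
}
```

This read-modify-write cycle is not atomic, so two sessions writing at the same moment could overwrite each other's entries. A sturdier version might write to a temporary file and rename it into place.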

The practical reason

When I am using multiple OpenCode sessions, I do not want one provider’s temporary limit to interrupt the flow. I also do not want to pay for every request if a subscription-backed model is still available.

opencode-model-fallback sits between those two needs: use the preferred model first, but keep the session moving when it is temporarily unavailable.