Features How it works Docs Pricing Compare LeanCTX Discord

Thinking Steering

LeanCTL classifies every prompt by complexity and adjusts the AI's reasoning budget accordingly. Simple tasks skip thinking entirely. Complex tasks get deep analysis.

How It Works

Every time you send a message, LeanCTL's intent classifier determines the task type (explore, generate, debug, refactor, casual) and maps it to a thinking budget. This happens transparently — the AI receives the optimal thinking parameters for each turn.

The result: you never pay for unnecessary reasoning tokens on simple tasks, but always get thorough analysis on complex ones.

Task classification

The intent classifier analyzes your prompt and assigns one of these task types:

Task TypeThinking BudgetExample
CasualMinimal (1K tokens)"what does this function do?"
GenerateMinimal (1K tokens)"add a login form"
ExploreMedium (4K tokens)"how does auth work in this project?"
ReviewMedium (4K tokens)"review this PR for issues"
TestTrace (8K tokens)"write comprehensive tests for auth.ts"
Debug / FixBugTrace (8K tokens)"fix the failing tests"
RefactorDeep (16K tokens)"refactor the database layer"
Config / DeployVaries"set up the CI pipeline"
Casual detection: When a prompt is classified as casual with low confidence and no file targets, LeanCTL disables all tools, caps output to 2,048 tokens, and forces think: false — saving maximum tokens for simple questions.

Provider-Specific Behavior

Thinking Steering adapts to each provider's API:

Anthropic (Claude)

Uses the budget_tokens parameter for extended thinking. Budget values are mapped directly from the task classification.

BudgetTokens
Minimal1,024
Medium4,096
Trace8,192
Deep16,384

OpenAI (GPT)

Uses the reasoning_effort parameter for o-series models:

BudgetEffort
Minimallow
Mediummedium
Trace / Deephigh

Ollama (local models)

Uses the think parameter (boolean). Simple tasks set think: false which can reduce output tokens by up to 95% on supported models (Qwen, DeepSeek-R1).

BudgetThink
Minimalfalse
Medium+true

Thinking Modes

Control thinking behavior globally via config or per-session with the /thinking command:

Auto (default)

The recommended mode. LeanCTL classifies each prompt and adjusts thinking automatically. You get optimal performance without any manual tuning.

/thinking auto

Always

Forces maximum thinking on every turn. Use when working on complex architectural tasks where you want the AI to reason deeply about every response.

/thinking always

Never

Disables thinking entirely. Maximum speed, minimum token usage. Best for simple code generation tasks where reasoning overhead isn't needed.

/thinking never

Manual

Disables automatic steering. You control thinking via the /thinking command between turns.

/thinking manual

Configuration

Set the default thinking mode in ~/.leanctl/config.toml:

[thinking]
mode = "auto"   # auto | always | never | manual

Override per session in the TUI:

/thinking auto     # switch to auto
/thinking never    # disable thinking for this session

Token Savings

Thinking Steering can save 30–80% on reasoning tokens depending on your workload. The savings come from:

  • Suppressing thinking on simple tasks — "add a comment", "rename this variable" don't need 16K tokens of reasoning
  • Right-sizing budgets — debugging gets 8K, not 16K; code generation gets 1K, not 8K
  • Provider-specific optimization — Ollama's think: false eliminates thinking tokens entirely on simple tasks

View your thinking savings with /cost or /stats in the TUI, or leanctl stats from the command line.

Status Bar Indicator

The TUI status bar shows the current thinking state after each turn:

◇ explore | thinking: auto-steered (think: false)
◇ debug | thinking: auto-steered (budget: 8192 tokens)
◇ generate | thinking: always-on (budget: 16384 tokens)

This helps you understand how thinking was applied to each request, and lets you verify that the auto-classifier is making appropriate decisions.