Thinking Steering

LeanCTL classifies every prompt by complexity and adjusts the AI's reasoning budget accordingly. Simple tasks skip thinking entirely. Complex tasks get deep analysis.

How It Works

Every time you send a message, LeanCTL's intent classifier determines the task type (for example explore, generate, debug, refactor, or casual) and maps it to a thinking budget. This happens transparently: the AI receives thinking parameters sized for each turn.

The result: you never pay for unnecessary reasoning tokens on simple tasks, but always get thorough analysis on complex ones.

Task Classification

The intent classifier analyzes your prompt and assigns one of these task types:

Task Type	Thinking Budget	Example
Casual	Minimal (1K tokens)	"what does this function do?"
Generate	Minimal (1K tokens)	"add a login form"
Explore	Medium (4K tokens)	"how does auth work in this project?"
Review	Medium (4K tokens)	"review this PR for issues"
Test	Trace (8K tokens)	"write comprehensive tests for auth.ts"
Debug / FixBug	Trace (8K tokens)	"fix the failing tests"
Refactor	Deep (16K tokens)	"refactor the database layer"
Config / Deploy	Varies	"set up the CI pipeline"
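
The table above can be read as a plain lookup from task type to budget tier. The following is a minimal sketch of that lookup, not LeanCTL's actual implementation: the names `Budget`, `TASK_BUDGETS`, and `budget_for` are hypothetical, and the exact token counts (1,024 and so on) are taken from the Anthropic table later on this page.

```python
from enum import Enum
from typing import Optional


class Budget(Enum):
    """Budget tiers with their token counts (hypothetical type, values from this page)."""
    MINIMAL = 1_024
    MEDIUM = 4_096
    TRACE = 8_192
    DEEP = 16_384


# Task type -> budget tier, copied from the table above.
# "config" and "deploy" are left out because their budget is listed as "Varies".
TASK_BUDGETS = {
    "casual": Budget.MINIMAL,
    "generate": Budget.MINIMAL,
    "explore": Budget.MEDIUM,
    "review": Budget.MEDIUM,
    "test": Budget.TRACE,
    "debug": Budget.TRACE,
    "fixbug": Budget.TRACE,
    "refactor": Budget.DEEP,
}


def budget_for(task_type: str) -> Optional[Budget]:
    """Return the tier for a task type, or None when the table gives no fixed value."""
    return TASK_BUDGETS.get(task_type.lower())
```

For example, `budget_for("Refactor")` returns `Budget.DEEP`, while `budget_for("deploy")` returns `None` because the table does not pin that type to a single tier.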

Casual detection: When a prompt is classified as casual with low confidence and no file targets, LeanCTL disables all tools, caps output at 2,048 tokens, and forces think: false, which minimizes token use on simple questions.
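
The casual rule combines three conditions and three overrides, which may be easier to follow as code. This is an illustrative sketch only: the `Classification` type, the `casual_overrides` function, the override key names, and the `LOW_CONFIDENCE` cutoff are all assumptions, since the page does not say what numeric value counts as low confidence.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List

# Hypothetical cutoff: the text says "low confidence" without giving a number.
LOW_CONFIDENCE = 0.5


@dataclass
class Classification:
    """Hypothetical shape of a classifier result."""
    task_type: str
    confidence: float
    file_targets: List[str] = field(default_factory=list)


def casual_overrides(c: Classification) -> Dict[str, Any]:
    """Return request overrides for the casual rule, or {} when it does not apply."""
    is_casual = c.task_type == "casual"
    is_low_confidence = c.confidence < LOW_CONFIDENCE
    has_no_targets = not c.file_targets
    if is_casual and is_low_confidence and has_no_targets:
        # No tools, output capped at 2,048 tokens, thinking off.
        return {"tools": [], "max_output_tokens": 2048, "think": False}
    return {}
```

All three conditions must hold: a casual prompt that names a file such as auth.ts falls back to the normal budget mapping in this sketch.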

Provider-Specific Behavior

Thinking Steering adapts to each provider's API:

Anthropic (Claude)

Uses the budget_tokens parameter for extended thinking. Budget values are mapped directly from the task classification.

Budget	Tokens
Minimal	1,024
Medium	4,096
Trace	8,192
Deep	16,384

OpenAI (GPT)

Uses the reasoning_effort parameter for o-series models:

Budget	Effort
Minimal	`low`
Medium	`medium`
Trace / Deep	`high`

Ollama (local models)

Uses the boolean think parameter. Simple tasks set think: false, which can reduce output tokens by up to 95% on supported models (Qwen, DeepSeek-R1).

Budget	Think
Minimal	`false`
Medium+	`true`
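
Taken together, the three provider tables amount to one translation step from a budget tier to request parameters. The sketch below shows one way that step might look; `provider_params` and `answer_tokens` are hypothetical names, and how LeanCTL actually builds its requests is not documented here. The parameter shapes follow each provider's public API: Anthropic expects a `thinking` object with `budget_tokens` and a `max_tokens` larger than that budget, OpenAI takes `reasoning_effort`, and Ollama takes a boolean `think`.

```python
from typing import Any, Dict

# Values copied from the three provider tables above.
ANTHROPIC_TOKENS = {"minimal": 1024, "medium": 4096, "trace": 8192, "deep": 16384}
OPENAI_EFFORT = {"minimal": "low", "medium": "medium", "trace": "high", "deep": "high"}
OLLAMA_THINK = {"minimal": False, "medium": True, "trace": True, "deep": True}


def provider_params(provider: str, budget: str, answer_tokens: int = 4096) -> Dict[str, Any]:
    """Translate a budget tier into provider-specific request parameters.

    answer_tokens is an assumed allowance for the visible reply; it is not
    a documented LeanCTL setting.
    """
    if provider == "anthropic":
        tokens = ANTHROPIC_TOKENS[budget]
        return {
            "thinking": {"type": "enabled", "budget_tokens": tokens},
            # Anthropic requires max_tokens to be greater than budget_tokens.
            "max_tokens": tokens + answer_tokens,
        }
    if provider == "openai":
        return {"reasoning_effort": OPENAI_EFFORT[budget]}
    if provider == "ollama":
        return {"think": OLLAMA_THINK[budget]}
    raise ValueError(f"unknown provider: {provider!r}")
```

Note that Trace and Deep collapse to the same `high` effort on OpenAI, and everything from Medium up collapses to `think: true` on Ollama, so the four tiers are only fully distinguishable on Anthropic.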

Thinking Modes

Control thinking behavior globally via config or per-session with the /thinking command:

Auto (default)

The recommended mode. LeanCTL classifies each prompt and adjusts thinking automatically. You get optimal performance without any manual tuning.

/thinking auto

Always

Forces maximum thinking on every turn. Use when working on complex architectural tasks where you want the AI to reason deeply about every response.

/thinking always

Never

Disables thinking entirely. Maximum speed, minimum token usage. Best for simple code generation tasks where reasoning overhead isn't needed.

/thinking never

Manual

Disables automatic steering. You control thinking via the /thinking command between turns.

/thinking manual
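
One way to picture the four modes is as a small function that decides whose budget wins on each turn. This is a sketch under assumptions rather than LeanCTL's code: `resolve_budget` is a hypothetical name, the 16,384-token ceiling for Always is inferred from the always-on status bar example on this page, and the `manual` argument stands in for whatever the user last chose, which the page does not specify in detail.

```python
from typing import Optional

# Assumed ceiling for "always"; matches the always-on status bar example.
MAX_BUDGET = 16_384


def resolve_budget(mode: str,
                   classified: Optional[int],
                   manual: Optional[int] = None) -> Optional[int]:
    """Return the thinking budget in tokens, or None when thinking is off.

    classified: the budget the intent classifier picked for this turn.
    manual: hypothetical user-chosen budget, only consulted in manual mode.
    """
    if mode == "auto":
        return classified      # the classifier decides
    if mode == "always":
        return MAX_BUDGET      # maximum thinking regardless of task
    if mode == "never":
        return None            # thinking disabled
    if mode == "manual":
        return manual          # the classifier's choice is ignored
    raise ValueError(f"unknown thinking mode: {mode!r}")
```

In this reading, the classifier still runs in every mode, but its budget is only used when the mode is auto.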

Configuration

Set the default thinking mode in ~/.leanctl/config.toml:

[thinking]
mode = "auto"   # auto | always | never | manual

Override per session in the TUI:

/thinking auto     # switch to auto
/thinking never    # disable thinking for this session

Token Savings

Thinking Steering can save 30–80% on reasoning tokens depending on your workload. The savings come from:

Suppressing thinking on simple tasks — "add a comment", "rename this variable" don't need 16K tokens of reasoning
Right-sizing budgets — debugging gets 8K, not 16K; code generation gets 1K, not 8K
Provider-specific optimization — Ollama's think: false eliminates thinking tokens entirely on simple tasks
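
To get a feel for where a figure in that range could come from, the sketch below compares steered budgets against a fixed 16K budget on every request. It is back-of-the-envelope arithmetic, not a measurement: the workload mix is invented for illustration, the fixed-16K baseline is an assumption, and budgets are ceilings, so actual token usage and savings will differ.

```python
# Per-task budgets in tokens, taken from the task classification table.
BUDGETS = {"casual": 1024, "generate": 1024, "explore": 4096,
           "debug": 8192, "refactor": 16384}


def budget_savings(request_counts: dict, baseline: int = 16384) -> float:
    """Fraction of budgeted thinking tokens saved versus a fixed baseline budget."""
    total_requests = sum(request_counts.values())
    steered = sum(BUDGETS[task] * n for task, n in request_counts.items())
    return 1 - steered / (baseline * total_requests)


# Hypothetical workload: 100 requests spread evenly over five task types.
mix = {"casual": 20, "generate": 20, "explore": 20, "debug": 20, "refactor": 20}
print(f"{budget_savings(mix):.1%}")  # 62.5%
```

A workload made up entirely of refactors would show no savings under this model, since every request already uses the full 16K budget.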

View your thinking savings with /cost or /stats in the TUI, or leanctl stats from the command line.

Status Bar Indicator

The TUI status bar shows the current thinking state after each turn:

◇ casual | thinking: auto-steered (think: false)
◇ debug | thinking: auto-steered (budget: 8192 tokens)
◇ generate | thinking: always-on (budget: 16384 tokens)

This helps you understand how thinking was applied to each request, and lets you verify that the auto-classifier is making appropriate decisions.