FallbackStrategy gives your agent automatic resilience: when the primary LLM fails with a transient error (rate limit, timeout, connection issue), the SDK tries alternate LLMs in order. Fallback is per-call — each new request always starts with the primary model.
If the call fails with a transient error, the FallbackStrategy kicks in and tries each fallback LLM in order
The first successful fallback response is returned to the caller
If all fallbacks fail, the original primary error is raised
Token usage and cost from fallback calls are merged into the primary LLM’s metrics, so you get a unified view of total spend by model
Only transient errors trigger fallback.
Non-transient errors (e.g., authentication failures, bad requests) are raised immediately without trying fallbacks.
For a complete list of supported transient errors see the source code
"""Example: Using FallbackStrategy for LLM resilience.When the primary LLM fails with a transient error (rate limit, timeout, etc.),FallbackStrategy automatically tries alternate LLMs in order. Fallback isper-call: each new request starts with the primary model. Token usage andcost from fallback calls are merged into the primary LLM's metrics.This example: 1. Saves two fallback LLM profiles to a temporary store. 2. Configures a primary LLM with a FallbackStrategy pointing at those profiles. 3. Runs a conversation — if the primary model is unavailable, the agent transparently falls back to the next available model."""import osimport tempfilefrom pydantic import SecretStrfrom openhands.sdk import LLM, Agent, Conversation, LLMProfileStore, Toolfrom openhands.sdk.llm import FallbackStrategyfrom openhands.tools.file_editor import FileEditorToolfrom openhands.tools.terminal import TerminalTool# Read configuration from environmentapi_key = os.getenv("LLM_API_KEY", None)assert api_key is not None, "LLM_API_KEY environment variable is not set."base_url = os.getenv("LLM_BASE_URL")primary_model = os.getenv("LLM_MODEL", "anthropic/claude-sonnet-4-5-20250929")# Use a temporary directory so this example doesn't pollute your home folder.# In real usage you can omit base_dir to use the default (~/.openhands/profiles).profile_store_dir = tempfile.mkdtemp()store = LLMProfileStore(base_dir=profile_store_dir)fallback_1 = LLM( usage_id="fallback-1", model=os.getenv("LLM_FALLBACK_MODEL_1", "openai/gpt-4o"), api_key=SecretStr(os.getenv("LLM_FALLBACK_API_KEY_1", api_key)), base_url=os.getenv("LLM_FALLBACK_BASE_URL_1", base_url),)store.save("fallback-1", fallback_1, include_secrets=True)fallback_2 = LLM( usage_id="fallback-2", model=os.getenv("LLM_FALLBACK_MODEL_2", "openai/gpt-4o-mini"), api_key=SecretStr(os.getenv("LLM_FALLBACK_API_KEY_2", api_key)), base_url=os.getenv("LLM_FALLBACK_BASE_URL_2", base_url),)store.save("fallback-2", fallback_2, include_secrets=True)print(f"Saved fallback profiles: {store.list()}")# Configure the primary LLM with a FallbackStrategyprimary_llm = LLM( usage_id="agent-primary", model=primary_model, api_key=SecretStr(api_key), base_url=base_url, fallback_strategy=FallbackStrategy( fallback_llms=["fallback-1", "fallback-2"], profile_store_dir=profile_store_dir, ),)# Run a conversationagent = Agent( llm=primary_llm, tools=[ Tool(name=TerminalTool.name), Tool(name=FileEditorTool.name), ],)conversation = Conversation(agent=agent, workspace=os.getcwd())conversation.send_message("Write a haiku about resilience into HAIKU.txt.")conversation.run()# Inspect metrics (includes any fallback usage)metrics = primary_llm.metricsprint(f"Total cost (including fallbacks): ${metrics.accumulated_cost:.6f}")print(f"Token usage records: {len(metrics.token_usages)}")for usage in metrics.token_usages: print( f" model={usage.model}" f" prompt={usage.prompt_tokens}" f" completion={usage.completion_tokens}" )print(f"EXAMPLE_COST: {metrics.accumulated_cost}")
You can run the example code as-is.
The model name should follow the LiteLLM convention: provider/model_name (e.g., anthropic/claude-sonnet-4-5-20250929, openai/gpt-4o).
The LLM_API_KEY should be the API key for your chosen provider.
ChatGPT Plus/Pro subscribers: You can use LLM.subscription_login() to authenticate with your ChatGPT account and access Codex models without consuming API credits. See the LLM Subscriptions guide for details.