GPT-5.5 API Pricing Guide 2026: Cost, Cached Input & Long-Context Tiers


GPT-5.5 API pricing on EvoLink is $4.00 per 1M input tokens, $24.00 per 1M output tokens, and $0.40 per 1M cached input tokens. For sessions above 272K input tokens, long-context pricing applies at $8.00 input and $36.00 output per 1M tokens.

This guide focuses only on GPT-5.5 pricing. If you want the full GPT family comparison, use the broader GPT-5 API pricing comparison.

Pricing note: The GPT-5.5 numbers in this article use EvoLink listed pricing as of April 26, 2026. OpenAI public pricing should be checked separately before quoting any value as an OpenAI direct rate.

GPT-5.5 API Pricing Table

Billing item | EvoLink price | Notes
Standard input | $4.00 / 1M tokens | Prompt, system instructions, conversation history, and other input text
Output | $24.00 / 1M tokens | Visible answer tokens plus reasoning tokens when applicable
Cached input | $0.40 / 1M tokens | Reused prompt/context segments billed at a lower rate
Long-context input | $8.00 / 1M tokens | Applies when input exceeds 272K tokens
Long-context output | $36.00 / 1M tokens | Applies in the same long-context session
Context window | 1M tokens | Use long-context pricing rules when large prompts cross the threshold
Max output | 128K tokens | Output budget, not a guaranteed response length


The most important pricing rule is the 272K threshold. GPT-5.5 can support a 1M-token context window, but very large prompts can move the whole session into the long-context rate.



How GPT-5.5 Billing Works


GPT-5.5 billing has three main token categories: input, output, and cached input.



Input tokens are the tokens you send to the model. They include your user prompt, system message, prior conversation, retrieved documents, code snippets, and tool instructions.

Output tokens are the tokens generated by the model. For reasoning models, output can include reasoning tokens in addition to visible answer text, depending on the API response and model configuration.

Cached input tokens are repeated input segments that can be billed at a lower rate. Caching matters most when your product sends the same system prompt, policy block, tool description, documentation pack, or conversation scaffold again and again.

Cached Input Example


Suppose your application sends a stable 50K-token instruction and documentation block.


Request type | Calculation | Cost
First uncached request | 50K x $4.00 / 1M | $0.20
Later cached request | 50K x $0.40 / 1M | $0.02


That difference is why stable prompt design matters. Keep reusable instructions identical across requests and place long, stable context where it can be reused consistently.
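The cached-input math above can be sketched as a small helper. The rates are the EvoLink GPT-5.5 prices quoted in this article; the function name is illustrative, not part of any SDK.

```python
# EvoLink GPT-5.5 rates quoted in this article, expressed per token.
INPUT_RATE = 4.00 / 1_000_000    # $ per standard input token
CACHED_RATE = 0.40 / 1_000_000   # $ per cached input token

def prompt_block_cost(tokens: int, cached: bool) -> float:
    """Cost of sending one prompt block, at the cached or standard rate."""
    rate = CACHED_RATE if cached else INPUT_RATE
    return tokens * rate

first_request = prompt_block_cost(50_000, cached=False)  # $0.20
later_request = prompt_block_cost(50_000, cached=True)   # $0.02
```

The same 50K-token block costs ten times less once it is served from cache, which is why the instructions need to stay byte-identical across requests.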



Long-Context Pricing Above 272K Tokens


GPT-5.5 has a large context window, but long-context prompts need a separate cost plan. On EvoLink, when the input exceeds 272K tokens, the long-context rate is:

GPT-5.5 tier | Input | Output
Standard pricing | $4.00 / 1M | $24.00 / 1M
Long-context pricing | $8.00 / 1M | $36.00 / 1M


The long-context rate applies to the session, not only to the tokens above 272K. If you send 300K input tokens, all 300K input tokens are priced at the long-context input rate.



Long-Context Cost Example


Here is a 300K input / 20K output request:

Line item | Calculation | Cost
Input | 300K x $8.00 / 1M | $2.40
Output | 20K x $36.00 / 1M | $0.72
Total | $2.40 + $0.72 | $3.12


If the same request were below the long-context threshold, the equivalent standard-rate cost would be $1.68. That does not mean you should always chunk aggressively; it means you should decide whether one full-context request is worth the higher price.
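The 272K rule described above can be captured in one function: once input crosses the threshold, all input and output tokens in the request are billed at the long-context rate, not just the excess. This is a sketch using the EvoLink rates from this article; the function name is illustrative.

```python
LONG_CONTEXT_THRESHOLD = 272_000  # input tokens

def gpt55_request_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate a GPT-5.5 request cost, applying the long-context
    rate to the whole request when input exceeds 272K tokens."""
    if input_tokens > LONG_CONTEXT_THRESHOLD:
        in_rate, out_rate = 8.00, 36.00   # long-context, $ per 1M tokens
    else:
        in_rate, out_rate = 4.00, 24.00   # standard, $ per 1M tokens
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000

gpt55_request_cost(300_000, 20_000)  # 3.12 — the worked example above
```

Running the same numbers with 270K input instead of 300K drops the whole request back to the standard rate, which is why prompts hovering near the threshold deserve scrutiny.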



Example GPT-5.5 API Costs


Use these examples as planning estimates. Your real bill depends on prompt length, output length, cache hit rate, retries, and whether reasoning tokens are generated.

Scenario | Input | Output | Rate used | Estimated cost
Customer support answer | 2K | 500 | Standard | $0.020
Code review task | 20K | 5K | Standard | $0.200
Repository analysis | 300K | 20K | Long-context | $3.120


The cost math:




  • 2K input + 500 output = (2,000 x $4 / 1M) + (500 x $24 / 1M) = $0.020

  • 20K input + 5K output = (20,000 x $4 / 1M) + (5,000 x $24 / 1M) = $0.200

  • 300K input + 20K output = (300,000 x $8 / 1M) + (20,000 x $36 / 1M) = $3.120


GPT-5.5 vs GPT-5.4 Pricing


GPT-5.5 is the premium GPT route. GPT-5.4 is the lower-cost flagship route. This section is intentionally short because a full model comparison should live in a separate GPT-5.5 vs GPT-5.4 article.

Model | Input | Output | Cached input | Context
GPT-5.5 | $4.00 / 1M | $24.00 / 1M | $0.40 / 1M | 1M
GPT-5.4 | $2.00 / 1M | $12.00 / 1M | $0.20 / 1M | 1.05M


Use GPT-5.4 when you need long context at a lower price. Test GPT-5.5 when the task is reasoning-heavy, quality-sensitive, or expensive to retry.



When Is GPT-5.5 Worth the Cost?


GPT-5.5 is not the default choice for every request. It is best used where the task value justifies premium pricing.



Good Fits



  • Complex reasoning where wrong answers are expensive

  • Full-codebase analysis, architecture review, and multi-file debugging

  • Research synthesis across many documents

  • Agent workflows where planning quality reduces retries

  • High-value outputs that need fewer manual corrections


Poor Fits



  • Simple classification

  • Bulk summarization

  • Lightweight extraction

  • Low-margin content generation

  • Prototyping where a cheaper model is good enough


The practical rule is simple: use GPT-5.5 when better reasoning can reduce failures, retries, or human review. Use cheaper GPT routes when the task is routine.



How to Reduce GPT-5.5 API Cost


1. Cache Stable Prompts


Keep reusable system prompts, policies, tool descriptions, and documentation blocks stable. Cached input is $0.40 / 1M tokens instead of $4.00 / 1M.
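Prompt caching generally keys on an identical leading segment of the request, so a common pattern is to place stable content first and per-request content last. This is a hypothetical sketch of that ordering; exact caching behavior depends on the provider, and the constant and function names here are placeholders.

```python
# Hypothetical: keep the long, stable blocks byte-identical across
# requests and put them before any per-request content, so a prefix
# cache can reuse them.
STABLE_SYSTEM_PROMPT = "You are a support assistant. Follow policy v3."
STABLE_DOCS_BLOCK = "<long documentation pack, identical on every request>"

def build_messages(user_question: str) -> list[dict]:
    return [
        {"role": "system", "content": STABLE_SYSTEM_PROMPT},  # stable, cacheable
        {"role": "system", "content": STABLE_DOCS_BLOCK},     # stable, cacheable
        {"role": "user", "content": user_question},           # varies per request
    ]
```

Even small edits to the stable blocks (a timestamp, a reordered tool list) can break the match, so keep anything dynamic in the trailing user message.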



2. Route Simple Work Elsewhere


Do not send every request to GPT-5.5. Use lower-cost GPT routes for simple tasks, and reserve GPT-5.5 for escalation or high-value reasoning.






def select_model(task_complexity: str) -> str:
    if task_complexity == "simple":
        return "gpt-5.1"
    if task_complexity == "standard":
        return "gpt-5.2"
    if task_complexity == "long_context":
        return "gpt-5.4"
    return "gpt-5.5"




3. Avoid Unnecessary Long-Context Requests


If your prompt is near 272K input tokens, check whether retrieval, summarization, or chunking can reduce the request without hurting answer quality.
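A cheap pre-flight check can flag prompts drifting toward the threshold. The sketch below assumes the rough heuristic of about 4 characters per token for English text; for real billing decisions, count tokens with an actual tokenizer, since the heuristic can be well off for code or non-English content.

```python
LONG_CONTEXT_THRESHOLD = 272_000  # input tokens

def estimated_tokens(text: str) -> int:
    """Rough estimate: ~4 characters per token (English-text heuristic;
    assumption, not a tokenizer)."""
    return len(text) // 4

def should_trim(prompt: str, margin: float = 0.9) -> bool:
    """True when the prompt is close enough to the 272K threshold that
    retrieval, summarization, or chunking is worth considering first."""
    return estimated_tokens(prompt) >= LONG_CONTEXT_THRESHOLD * margin
```

The `margin` leaves headroom so that tokenizer error or appended conversation history does not silently push a request into long-context pricing.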



4. Track Cost Per Successful Task


Cost per token is only one metric. Track retries, validation failures, human review time, latency, and final success rate. A more expensive model can be cheaper if it avoids repeated failed attempts, but that has to be measured in your own workflow.
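The metric itself is simple division, but it is worth writing down because it is what makes a pricier model comparable to a cheaper one. The figures in the example are illustrative, reusing the request costs computed earlier in this article.

```python
def cost_per_successful_task(total_spend: float, successes: int) -> float:
    """Spend divided by successful outcomes; retries and failures raise
    this number even when the per-token rate looks cheap."""
    if successes == 0:
        return float("inf")
    return total_spend / successes

# Illustrative: a $1.68 request that needs three attempts to succeed once
# ends up costlier per success than a single $3.12 request that lands.
cheaper_route = cost_per_successful_task(1.68 * 3, successes=1)  # $5.04
premium_route = cost_per_successful_task(3.12, successes=1)      # $3.12
```

Whether the premium route actually wins depends on measured success rates in your workflow, not on the rate card.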



5. Use GPT-5.5 as an Escalation Route


One common pattern is to start with GPT-5.2 or GPT-5.4 and escalate to GPT-5.5 only when validation fails, confidence is low, or the user requests a deeper pass.
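That escalation pattern can be sketched as a small wrapper. `call_model` and `validate` are placeholders for your own client and validation logic, not real SDK functions.

```python
from typing import Callable

def answer_with_escalation(
    prompt: str,
    call_model: Callable[[str, str], str],  # (model, prompt) -> answer
    validate: Callable[[str], bool],        # answer -> passed?
) -> str:
    """Try the cheaper route first; escalate to GPT-5.5 only when the
    draft fails validation."""
    draft = call_model("gpt-5.4", prompt)
    if validate(draft):
        return draft
    return call_model("gpt-5.5", prompt)  # deeper, pricier pass
```

With this shape, GPT-5.5 tokens are spent only on the fraction of traffic the cheaper model cannot handle, which is usually where the cost-per-successful-task argument favors it.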



FAQ


How much does GPT-5.5 API cost?


GPT-5.5 costs $4.00 per 1M input tokens, $24.00 per 1M output tokens, and $0.40 per 1M cached input tokens on EvoLink. Long-context pricing above 272K input tokens is $8.00 input and $36.00 output per 1M tokens.



What is GPT-5.5 cached input pricing?


GPT-5.5 cached input pricing on EvoLink is $0.40 per 1M tokens. Cached input is useful when your application repeats stable instructions, documentation, tool definitions, or conversation scaffolds.



What happens above 272K input tokens?


When input exceeds 272K tokens, GPT-5.5 uses long-context pricing on EvoLink: $8.00 per 1M input tokens and $36.00 per 1M output tokens. The long-context rate applies to the session.



Is GPT-5.5 more expensive than GPT-5.4?


Yes. GPT-5.5 is priced higher than GPT-5.4. GPT-5.5 is $4.00 / $24.00 per 1M input/output tokens on EvoLink, while GPT-5.4 is $2.00 / $12.00.



Is GPT-5.5 worth it for coding?


GPT-5.5 is worth testing for complex coding tasks such as multi-file debugging, repository analysis, architecture review, and agentic coding workflows. For simple code completion or small edits, a lower-cost GPT route may be more efficient.



Can I use GPT-5.5 with an OpenAI-compatible API?


Yes. EvoLink provides an OpenAI-compatible integration path, so most teams can migrate by changing the base URL, API key, and model value.






from openai import OpenAI

client = OpenAI(
    api_key="your-evolink-api-key",
    base_url="https://api.evolink.ai/v1",
)

response = client.chat.completions.create(
    model="gpt-5.5",
    messages=[
        {"role": "user", "content": "Summarize the main risks in this codebase."}
    ],
)




Where can I compare GPT-5.5 with other GPT models?


Use the GPT model family page for the broader model lineup, or read the GPT-5 API pricing comparison for GPT-5.5, GPT-5.4, GPT-5.2, and GPT-5.1 pricing in one table.

Start With GPT-5.5 Pricing, Then Test on Your Own Tasks


GPT-5.5 is a premium route, so the right question is not only "How much does it cost per token?" The better question is "What does it cost per successful task?"


Start with a small test set, measure retries and review time, compare GPT-5.5 against GPT-5.4 or GPT-5.2, and reserve GPT-5.5 for the workflows where it changes the outcome.


