OpenAI Responses API vs Chat Completions API
A detailed comparison of OpenAI's new Responses API and Conversations API versus the legacy Chat Completions API — covering features, migration path, and why new projects should use the Responses API.
Specs Comparison
| Feature | Responses API (New) | Chat Completions API (Legacy) |
|---|---|---|
| Endpoint | POST /v1/responses | POST /v1/chat/completions |
| Released | 2025 | 2023 |
| Context Management | Conversations API (automatic, server-side) | Manual — you send full message history |
| Conversation Persistence | Built-in — conversations never expire | None — you manage your own database |
| Web Search | Built-in tool (web_search) | Not available |
| File Search | Built-in tool (file_search) | Not available (was via Assistants API) |
| Code Execution | Built-in tool (code_interpreter) | Not available (was via Assistants API) |
| Function Calling | Yes — same format as Chat Completions | Yes — tools parameter |
| Streaming | Yes — server-sent events | Yes — server-sent events |
| Model Support | GPT-5.2, GPT-5-mini, GPT-4.1-mini, and newer models | All OpenAI models including legacy (GPT-4, GPT-3.5) |
| Response Storage | Automatic with store: true (30-day retention) | None — stateless |
| Reasoning Control | reasoning.effort parameter (e.g. none, low, medium, high) | Not available |
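The differences in the table come down to request shape. The sketch below shows illustrative request bodies only (not full client code); the field names follow the table above, and the model names are the ones used in this comparison, so substitute whatever your account supports.

```python
# Chat Completions: the client assembles and sends the full message history.
chat_completions_request = {
    "model": "gpt-4.1-mini",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is the Responses API?"},
    ],
}

# Responses API: "input" replaces "messages"; server-side tools are opt-in.
responses_request = {
    "model": "gpt-5-mini",
    "input": "What is the Responses API?",
    "tools": [{"type": "web_search"}],   # built-in tool, no external integration
    "reasoning": {"effort": "medium"},   # cost/latency control
    "store": True,                       # 30-day response retention
}
```

Note that the Responses API accepts either a plain string or a message list as `input`, so simple single-turn calls get noticeably shorter.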
Responses API (New)
Pros
- Conversations API eliminates manual history management
- Built-in web search — no external integration needed
- Built-in file search and code execution tools
- Server-side conversation storage (never expires)
- Simpler API surface — one endpoint for everything
- Reasoning effort control for cost optimization
- Designed for agentic workflows
Cons
- Newer API — fewer community examples and tutorials
- Some parameters differ from Chat Completions (learning curve)
- gpt-5-mini does not support temperature parameter
- Conversations API adds vendor lock-in for context management
Best for
All new projects. Especially valuable for chat applications (Conversations API), apps needing web search, and agentic workflows with built-in tools.
Chat Completions API (Legacy)
Pros
- Mature and battle-tested — extensive documentation
- Massive community with abundant examples and libraries
- Full control over conversation history and storage
- No vendor lock-in for context management
- Temperature parameter works on all models
- Compatible with OpenAI-compatible providers (Together, Groq)
Cons
- Manual conversation history management required
- No built-in web search, file search, or code execution
- Stateless — every request must include full context
- Assistants API (for tools) has been superseded
- Will not receive new features — Responses API is the future
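The history-management burden listed above is easy to see in a minimal sketch: with Chat Completions, the caller must accumulate every message and resend the whole list each turn. The `call_chat_completions` function here is a local stand-in for the real HTTP call, used only to show the bookkeeping the client owns.

```python
def call_chat_completions(messages):
    # Placeholder: a real implementation would POST the list to
    # /v1/chat/completions and return the assistant message.
    return {"role": "assistant", "content": f"(reply to {len(messages)} messages)"}

# The client, not the server, owns the conversation state.
history = [{"role": "system", "content": "You are a helpful assistant."}]

for user_text in ["Hello", "Tell me more"]:
    history.append({"role": "user", "content": user_text})
    reply = call_chat_completions(history)  # full history sent every turn
    history.append(reply)                   # the client stores the reply too
```

After just two turns the client is already tracking five messages itself; with the Conversations API, that list lives server-side instead.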
Best for
Legacy projects already using Chat Completions, applications using OpenAI-compatible providers, and edge cases requiring full manual control over conversation state.
Verdict
Use the Responses API for all new projects. It provides built-in web search, file search, code execution, and server-side conversation management, and it is where OpenAI is investing its new feature development. The Chat Completions API still works and will remain supported, but it is effectively in maintenance mode. Migrate existing projects when convenient, prioritizing those that would benefit most from the Conversations API (chat apps) or the built-in tools (search, code execution).
Frequently Asked Questions
Is the Chat Completions API deprecated?
Not officially deprecated, but OpenAI has made clear that the Responses API is the future. New features like built-in web search, Conversations API, and reasoning effort control are only available on the Responses API. Chat Completions will continue working but will not receive major new features.
How do I migrate from Chat Completions to Responses API?
The core change: replace your POST /v1/chat/completions call with POST /v1/responses, change "messages" to "input", and add a "conversation" parameter if using Conversations API. Function calling works the same way. Most migrations take a few hours for simple apps.
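As a rough illustration of the parameter mapping just described, a hypothetical helper could translate a basic Chat Completions payload into a Responses payload. The function name is an assumption for this sketch, it covers only the fields named above, and it passes `tools` through unchanged on the premise that function-calling definitions keep the same format; consult the official migration guide for streaming, vision, and other edge cases.

```python
def chat_payload_to_responses_payload(chat_payload, conversation_id=None):
    """Map a basic Chat Completions request body to a Responses request body.

    Hypothetical helper covering only the core fields: model, messages->input,
    tools (same format), and an optional conversation reference.
    """
    responses_payload = {
        "model": chat_payload["model"],
        "input": chat_payload["messages"],  # "messages" becomes "input"
    }
    if "tools" in chat_payload:
        responses_payload["tools"] = chat_payload["tools"]  # same format
    if conversation_id is not None:
        responses_payload["conversation"] = conversation_id
    return responses_payload

migrated = chat_payload_to_responses_payload(
    {"model": "gpt-4.1-mini", "messages": [{"role": "user", "content": "Hi"}]},
    conversation_id="conv_123",  # illustrative ID
)
```

The endpoint change (POST /v1/chat/completions to POST /v1/responses) happens in your HTTP layer; the payload mapping above is the rest of a simple migration.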
What is the Conversations API?
The Conversations API (POST /v1/conversations) creates a server-side conversation that persists indefinitely. When you create a response with a "conversation" parameter, OpenAI automatically manages the message history, eliminating the need to store and resend that history yourself. For chat applications, this is a major simplification.
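The two-step flow described above can be sketched as illustrative request descriptions (plain dicts standing in for HTTP calls; the conversation ID shown is a made-up example of what step 1 would return):

```python
# Step 1: create a persistent, server-side conversation once.
create_conversation = {
    "method": "POST",
    "path": "/v1/conversations",
    "body": {},  # optional metadata could go here
}

# Step 2: each turn references the conversation by ID; no history is resent.
send_turn = {
    "method": "POST",
    "path": "/v1/responses",
    "body": {
        "model": "gpt-5-mini",
        "conversation": "conv_abc123",  # illustrative ID from step 1's response
        "input": "And what about streaming?",
    },
}
```

Compare this with the Chat Completions pattern, where the `messages` array in every request must carry the entire prior exchange.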
Why does gpt-5-mini not support the temperature parameter?
This is a known limitation of gpt-5-mini specifically. If your application relies on temperature for varied outputs, use gpt-4.1-mini instead; it supports temperature and works with both the Responses API and the Chat Completions API. gpt-5.2 likewise does not use temperature, exposing output control through the reasoning.effort parameter instead.
Related Glossary Terms
GPT (Generative Pre-trained Transformer): OpenAI's family of generative pre-trained transformer models, the most widely adopted LLMs for commercial AI applications.
Function Calling (Tool Use): An AI capability where the model can decide to invoke external functions or APIs based on the conversation context.
Streaming: A method of receiving AI model output token-by-token in real time as it is generated, rather than waiting for the complete response.
Conversational AI: AI systems designed for natural, multi-turn dialogue with humans, maintaining context across exchanges and handling follow-up questions naturally.
Tool Use (AI): The capability of AI models to interact with external tools, APIs, and systems by generating structured function calls based on natural language instructions.
Need help choosing?
AI 4U Labs builds with both Responses API and Chat Completions API. We'll recommend the right tool for your specific use case and build it for you in 2-4 weeks.