OpenAI has announced significant performance improvements for its Codex agent workflows by adding WebSocket support to the Responses API. The change addresses a key bottleneck in agentic systems: it reduces per-request API overhead and improves model response latency, making automated workflows more efficient and scalable.
Reducing Latency Through Connection-Scoped Caching
The new approach uses WebSockets to maintain persistent connections between clients and OpenAI's servers, enabling real-time communication without the cost of establishing a new HTTP connection for every call. According to OpenAI, this method incorporates connection-scoped caching that stores intermediate results and context for the lifetime of the connection, eliminating redundant computation and significantly cutting latency.
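OpenAI has not published the internals of its connection-scoped cache, so the following is only an illustrative sketch of the general idea: results computed earlier in a session are reused for the lifetime of the connection and discarded when it closes. The class name, the `submit` method, and the placeholder compute step are all assumptions, not OpenAI's API.

```python
import hashlib


class Connection:
    """Illustrative stand-in for a persistent WebSocket session.

    Hypothetical sketch only: the `submit` method and the placeholder
    compute step are assumptions, not part of any real OpenAI client.
    """

    def __init__(self):
        # The cache lives only as long as this connection: results
        # computed earlier in the session are reused, not recomputed.
        self._cache = {}
        self.compute_calls = 0

    def _key(self, payload: str) -> str:
        # Hash the request payload to get a stable cache key.
        return hashlib.sha256(payload.encode()).hexdigest()

    def submit(self, payload: str) -> str:
        key = self._key(payload)
        if key in self._cache:       # cache hit: skip the expensive step
            return self._cache[key]
        self.compute_calls += 1      # cache miss: do the work once
        result = payload.upper()     # placeholder for model inference
        self._cache[key] = result
        return result

    def close(self):
        # Connection-scoped: the cache vanishes with the connection.
        self._cache.clear()
```

Repeating the same request within one session touches the expensive step only once; a new connection starts with an empty cache.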
Enhanced Agent Loop Performance
The implementation specifically targets the Codex agent loop, which is crucial for complex automated workflows involving multiple API calls and decision-making processes. By streamlining this loop, OpenAI reports that developers can now execute agentic workflows up to 40% faster than before. This improvement is particularly valuable for applications requiring rapid response times, such as chatbots, automated customer service systems, and real-time data processing tools.
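Why a persistent connection helps an agent loop in particular can be seen with a simple latency model. This is not OpenAI's published benchmark methodology; the numbers below (a 100 ms handshake, a 400 ms model call, a 10-step loop) are assumed values chosen only to show how per-request connection setup is amortized.

```python
def loop_cost(n_calls: int, setup_ms: float, call_ms: float,
              persistent: bool) -> float:
    """Total latency for an agent loop of n_calls model requests.

    Hypothetical model: each request costs call_ms; connection setup
    (TCP/TLS handshake, auth) costs setup_ms. A persistent WebSocket
    pays setup once; per-request HTTP pays it on every call.
    """
    setups = 1 if persistent else n_calls
    return setups * setup_ms + n_calls * call_ms


# Assumed figures: 10-step loop, 100 ms handshake, 400 ms per model call.
http_total = loop_cost(10, 100, 400, persistent=False)  # 5000 ms
ws_total = loop_cost(10, 100, 400, persistent=True)     # 4100 ms
```

The more steps the loop takes and the heavier the handshake, the larger the share of total latency the persistent connection removes, which is why the gain shows up most in multi-call agentic workflows.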
Implications for Developers and Enterprises
The update positions OpenAI's platform as a more competitive option for businesses relying on complex AI workflows. Developers building sophisticated agent-based applications will benefit from reduced infrastructure costs and improved user experience. This advancement also reflects a broader industry trend toward optimizing AI systems for real-time performance, as organizations increasingly demand responsive and efficient automated solutions.
The improvements mark a significant step forward in making AI-powered agent systems more practical for enterprise deployment, where latency and efficiency are critical factors in system success.