

In marked contrast to last year's splashy event, OpenAI held a more subdued DevDay conference on Tuesday, eschewing major product launches in favor of incremental improvements to its existing suite of AI tools and APIs.

The company's focus this year was on empowering developers and showcasing community stories, signaling a shift in strategy as the AI landscape becomes increasingly competitive.

The company unveiled four major innovations at the event: Vision Fine-Tuning, Realtime API, Model Distillation, and Prompt Caching. These new tools highlight OpenAI's strategic pivot toward empowering its developer ecosystem rather than competing directly in the end-user application space.

Prompt caching: A boon for developer budgets

One of the most significant announcements is the introduction of Prompt Caching, a feature aimed at reducing costs and latency for developers.

This system automatically applies a 50% discount on input tokens that the model has recently processed, potentially leading to substantial savings for applications that frequently reuse context.
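Because the discount applies automatically to repeated prompt prefixes, the main lever developers have is prompt structure. Below is a minimal sketch, assuming the standard OpenAI Python SDK, of placing a large unchanging context at the front of each request so consecutive calls share the longest possible cacheable prefix; the model name, system prompt, and helper function are illustrative, not taken from OpenAI's announcement.

```python
# Minimal sketch: structuring requests to benefit from Prompt Caching.
# The discount is applied automatically by OpenAI; no special flag is set here.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Keep the large, static context at the START of the prompt so repeated
# requests share the longest possible prefix. (Illustrative placeholder text.)
STATIC_CONTEXT = (
    "You are a support assistant for ExampleCo.\n"
    "...long product documentation pasted here...\n"
)

def ask(question: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            # Identical across calls, so it can be served from the cache.
            {"role": "system", "content": STATIC_CONTEXT},
            # Only this part varies between requests.
            {"role": "user", "content": question},
        ],
    )
    return response.choices[0].message.content

# Repeated calls reuse the cached prefix, so the shared input tokens
# are billed at the discounted rate.
print(ask("How do I reset my password?"))
print(ask("What payment methods do you support?"))
```

The design point is simply ordering: anything that changes per request (the user's question, retrieved documents that differ each time) should come after the stable context, not before it.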

"We've been pretty busy," said Olivier Godement, OpenAI's head of product for the platform, at a small press conference at the company's San Francisco headquarters kicking off the developer conference. "Just two years ago, GPT-3 was winning. Now, we've reduced [those] costs by almost 1000x. I was trying to come up with an example of technologies who reduced their costs by almost 1000x in two years—and I cannot come up with an example."

This dramatic cost reduction presents a major opportunity for startups and enterprises to explore new applications that were previously out of reach due to cost.

A pricing table from OpenAI's DevDay 2024 reveals major cost reductions for AI model usage, with cached input tokens offering up to 50% savings compared to uncached tokens across various GPT models. The new o1 model showcases premium pricing, reflecting its advanced capabilities. (Credit: OpenAI)

Vision fine-tuning: A new frontier in visual AI

Another major announcement is the introduction of vision fine-tuning for GPT-4o, OpenAI's latest large language model. This feature allows developers to customize the model's visual understanding capabilities using both images and text.
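In practice, a vision fine-tuning example pairs an image with the text the model should learn to produce. The sketch below shows what one training record might look like, assuming the chat-style JSONL format OpenAI uses for fine-tuning, with an image attached to the user message; the URL, question, and label are invented for illustration and are not real training data.

```python
# Minimal sketch of a single vision fine-tuning training record,
# assuming OpenAI's chat-format JSONL with an image_url content part.
import json

example = {
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "How many lanes does this road have?"},
                {
                    "type": "image_url",
                    # Hypothetical image URL, for illustration only.
                    "image_url": {"url": "https://example.com/street_view_001.jpg"},
                },
            ],
        },
        # The assistant message supplies the target answer the model
        # should learn to produce for this image.
        {"role": "assistant", "content": "This road has 3 lanes."},
    ]
}

# Fine-tuning data is uploaded as JSONL: one example object per line.
with open("training_data.jsonl", "a") as f:
    f.write(json.dumps(example) + "\n")
```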

The implications of this update are far-reaching, potentially impacting fields such as autonomous vehicles, medical imaging, and visual search functionality.

Grab, a leading Southeast Asian food delivery and rideshare company, has already leveraged this technology to improve its mapping services, according to OpenAI.

Using just 100 examples, Grab reportedly achieved a 20 percent improvement in lane count accuracy and a 13 percent boost in speed limit sign localization.

This real-world application demonstrates the potential for vision fine-tuning to dramatically enhance AI-powered services across a wide range of industries using small batches of visual training data.

Realtime API: Bridging the gap in conversational AI

OpenAI