OpenAI now helps you transcribe using “Whisper”


OpenAI has recently launched its speech-to-text model, Whisper, through an API. The API will enable third-party developers to integrate the AI-powered speech-to-text model into their own applications and services. 


Priced at $0.006 per minute through the API, the open-source model transcribes audio in multiple languages and can also translate it into English. It accepts audio files in a variety of formats, including M4A, MP3, MP4, MPEG, MPGA, WAV, and WEBM.
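To make the pricing concrete, here is a minimal Python sketch that checks a file against the formats listed above and estimates what a recording would cost to transcribe. The function names are hypothetical, and the sketch assumes the $0.006 rate scales linearly with audio length; how OpenAI rounds billed duration is not stated here, so treat the result as an estimate only.

```python
from pathlib import Path

# Price and formats as quoted in this article; both may change over time.
PRICE_PER_MINUTE_USD = 0.006
SUPPORTED_EXTENSIONS = {"m4a", "mp3", "mp4", "mpeg", "mpga", "wav", "webm"}


def is_supported(filename: str) -> bool:
    """Return True if the file extension is one of the listed formats."""
    extension = Path(filename).suffix.lower().lstrip(".")
    return extension in SUPPORTED_EXTENSIONS


def estimate_cost(duration_seconds: float) -> float:
    """Estimate the cost in USD, assuming simple linear per-minute pricing."""
    if duration_seconds < 0:
        raise ValueError("duration must be non-negative")
    return duration_seconds / 60 * PRICE_PER_MINUTE_USD
```

Under these assumptions, a 90-minute recording (5,400 seconds) would come to roughly $0.54.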


The model stands out for having been trained on 680,000 hours of multilingual audio, which lets it provide robust transcription across multiple languages. Because it is open source, it can also be run on users' own hardware at no charge; the API is OpenAI's hosted version of the model.

The Whisper API was introduced alongside the ChatGPT API. OpenAI claims that the model behind it, gpt-3.5-turbo, is also the best model for many non-chat use cases. While ChatGPT has already been integrated into applications such as Snapchat's My AI feature, OpenAI is highlighting that the model is useful for more than just chatbots.


OpenAI has priced the ChatGPT API at $0.002 per 1,000 tokens, which the company says is 10 times cheaper than its existing GPT-3.5 models. It attributes the lower price to a series of system-wide optimizations. Developers can use the ChatGPT API to build AI-powered chat interfaces as well as applications that have nothing to do with chat.
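The per-token price can be turned into a quick budget estimate. The Python sketch below is illustrative only: the function name is hypothetical, and it assumes every token, whether sent in the prompt or returned in the reply, is billed at the same $0.002 per 1,000 rate quoted above.

```python
# Rate as quoted in this article; it may change over time.
PRICE_PER_1K_TOKENS_USD = 0.002


def estimate_chat_cost(total_tokens: int) -> float:
    """Estimate the cost in USD for a given number of tokens.

    Assumes a single flat rate for all tokens (an assumption, not a
    statement of how OpenAI actually bills).
    """
    if total_tokens < 0:
        raise ValueError("token count must be non-negative")
    return total_tokens / 1000 * PRICE_PER_1K_TOKENS_USD
```

On that basis, an application that processes 250,000 tokens would spend about $0.50.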


The launch also brings significant policy changes, including a shift from an opt-out to an opt-in system for the use of customer data. In practice, OpenAI will not use data submitted through the API to train its models unless customers explicitly consent.


The introduction of the ChatGPT API is expected to open the floodgates for developers who want to integrate AI-powered chatbots into their applications. While many companies are working on their own AI chatbot models, the cost of developing such models is out of reach for most developers. OpenAI's ChatGPT API could be the solution to this problem.


Whisper, on the other hand, is a significant tool for developers who require speech-to-text functionality in their applications. The model's ability to transcribe and translate audio content in multiple languages could help organizations to develop new applications and services for a global audience.