Text to Speech API

Natural-sounding voice synthesis in 11 Indian languages.

Overview

CallMissed's Text to Speech API converts text into natural-sounding audio in Indian languages. Our API offers two voice engines: Sarvam AI's bulbul:v3 with 39 native Indian voices across 11 languages, and ElevenLabs for premium multilingual voice synthesis. Both are accessible through a single OpenAI-compatible API endpoint.

Unlike generic TTS systems that apply English prosody to Indian languages, our Sarvam voices are trained natively on Indian speech patterns. The result is audio that sounds like a real Indian speaker — with correct intonation, rhythm, and pronunciation for each language.

Voice Engines

Sarvam bulbul:v3 (Recommended for Indian Languages)

39 distinct voices — male and female speakers with varied age and tone profiles
11 Indian languages: Hindi, Tamil, Telugu, Bengali, Marathi, Kannada, Malayalam, Gujarati, Punjabi, Odia, English (Indian)
Telephony-optimized: 8kHz sample rate option for voice calls, reducing bandwidth by about 6x compared with 48kHz output, without perceptible quality loss on phone lines
Low latency: First audio chunk delivered in under 200ms for real-time conversational applications
Pricing: $0.18 per 10,000 characters

ElevenLabs (Premium Multilingual)

eleven_multilingual_v2: Highest quality, supports 29 languages including Hindi and English. Best for pre-recorded content, audiobooks, and premium applications. $0.30 per 1,000 characters.
eleven_flash_v2.5: Lower latency variant optimized for real-time use. Good balance of quality and speed. $0.15 per 1,000 characters.
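
Because the Sarvam price is quoted per 10,000 characters and the ElevenLabs prices per 1,000, a small calculation helps compare them on equal terms. The sketch below hard-codes the rates listed above and assumes billing scales linearly with character count, with no rounding or minimum charge; the name estimate_cost and the table layout are illustrative, not part of the API.

```python
from decimal import Decimal

# Listed prices as (price in USD, number of characters that price covers).
PRICING = {
    "bulbul:v3": (Decimal("0.18"), 10_000),
    "eleven_multilingual_v2": (Decimal("0.30"), 1_000),
    "eleven_flash_v2.5": (Decimal("0.15"), 1_000),
}


def estimate_cost(model: str, num_chars: int) -> Decimal:
    """Estimate the USD cost of synthesizing num_chars characters.

    Assumes purely linear per-character billing (an assumption, not a
    documented billing rule).
    """
    price, per_chars = PRICING[model]
    return price * num_chars / per_chars


if __name__ == "__main__":
    # Compare the cost of one maximum-length (2,500 character) request.
    for model in PRICING:
        print(model, estimate_cost(model, 2_500))
```

Under these assumptions, a 2,500-character request works out to $0.045 on bulbul:v3, $0.75 on eleven_multilingual_v2, and $0.375 on eleven_flash_v2.5.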

API Reference

Endpoint

POST https://api.callmissed.com/v1/audio/speech

Request Body (JSON)

model — bulbul:v3 | eleven_multilingual_v2 | eleven_flash_v2.5
input — Text to synthesize (max 2,500 characters)
voice — Speaker name (e.g., ritu, shubh, priya, rachel)
language — BCP-47 code (e.g., hi-IN). Required for Sarvam, auto-detected for ElevenLabs.
response_format — mp3 | wav | opus | aac | flac | pcm
speed — Playback speed multiplier (0.5 to 2.0)
speech_sample_rate — 8000, 16000, 22050, 44100, or 48000 Hz (Sarvam only)
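
The constraints in the field list above can be checked on the client before a request is sent. The following is a minimal sketch: build_speech_request is a hypothetical helper, not part of any SDK, and it assumes that Python's len() counts characters the same way the API does. Optional fields are left out of the body when not given, since the defaults are not stated here.

```python
SARVAM_MODELS = {"bulbul:v3"}
ELEVENLABS_MODELS = {"eleven_multilingual_v2", "eleven_flash_v2.5"}
RESPONSE_FORMATS = {"mp3", "wav", "opus", "aac", "flac", "pcm"}
SAMPLE_RATES_HZ = {8000, 16000, 22050, 44100, 48000}
MAX_INPUT_CHARS = 2500


def build_speech_request(model, text, voice, language=None,
                         response_format=None, speed=None,
                         speech_sample_rate=None):
    """Validate the documented constraints and return the JSON body as a dict."""
    is_sarvam = model in SARVAM_MODELS
    if not is_sarvam and model not in ELEVENLABS_MODELS:
        raise ValueError(f"unknown model: {model!r}")
    if len(text) > MAX_INPUT_CHARS:
        raise ValueError(f"input exceeds {MAX_INPUT_CHARS} characters")
    if is_sarvam and language is None:
        raise ValueError("language is required for Sarvam models")
    if response_format is not None and response_format not in RESPONSE_FORMATS:
        raise ValueError(f"unsupported response_format: {response_format!r}")
    if speed is not None and not 0.5 <= speed <= 2.0:
        raise ValueError("speed must be between 0.5 and 2.0")
    if speech_sample_rate is not None:
        if not is_sarvam:
            raise ValueError("speech_sample_rate applies to Sarvam models only")
        if speech_sample_rate not in SAMPLE_RATES_HZ:
            raise ValueError(f"unsupported sample rate: {speech_sample_rate}")

    body = {"model": model, "input": text, "voice": voice}
    optional = {
        "language": language,
        "response_format": response_format,
        "speed": speed,
        "speech_sample_rate": speech_sample_rate,
    }
    # Only send the optional fields the caller actually set.
    body.update({key: value for key, value in optional.items() if value is not None})
    return body
```

For example, build_speech_request("bulbul:v3", "नमस्ते", "ritu", language="hi-IN", speech_sample_rate=8000) returns a body suitable for a telephony request, while omitting language for a Sarvam model raises ValueError.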

Response

Binary audio data in the requested format. Stream directly to a media player or save to file.
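
As a rough illustration of the request and response cycle, the sketch below posts a JSON body to the endpoint and writes the returned bytes to a file using only the Python standard library. The Bearer authorization header is an assumption based on the API being OpenAI-compatible, and make_request and synthesize_to_file are hypothetical helper names.

```python
import json
import shutil
import urllib.request

API_URL = "https://api.callmissed.com/v1/audio/speech"


def make_request(api_key, payload):
    """Build the POST request. Bearer auth is assumed, not confirmed."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def synthesize_to_file(api_key, payload, out_path):
    """Send the request and copy the binary audio response to out_path."""
    request = make_request(api_key, payload)
    with urllib.request.urlopen(request, timeout=30) as response:
        with open(out_path, "wb") as out:
            # Copy in chunks rather than holding the whole file in memory.
            shutil.copyfileobj(response, out)


# Example call (needs a valid key and network access):
# synthesize_to_file(
#     "cm_your_key_here",
#     {"model": "bulbul:v3", "input": "नमस्ते", "voice": "ritu",
#      "language": "hi-IN", "response_format": "mp3"},
#     "hello.mp3",
# )
```

A non-2xx response raises urllib.error.HTTPError, so production code would normally wrap the call and inspect the error body.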

Sarvam Voice Catalog

All 39 Sarvam voices are available across all 11 supported languages. Popular voices include:

Ritu: Clear female voice, professional tone. Ideal for IVR, customer support, and information delivery.
Shubh: Warm male voice, conversational style. Great for chatbots and voice agents.
Priya: Energetic female voice with younger tonality. Suited for marketing, announcements, and engaging content.
Aditya: Authoritative male voice with deeper register. Best for formal communications and enterprise applications.
Neha, Rahul, Pooja, Rohan, Simran, Kavya: Additional voices spanning varied ages and tonal profiles.

Use Cases

Voice Agents and IVR

Power real-time AI voice agents with natural-sounding speech. Our telephony-optimized 8kHz output integrates directly with Twilio, Exotel, and other telephony providers. The sub-200ms first-chunk latency means callers hear responses almost instantly — eliminating the awkward silence that makes robotic systems obvious.
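
To see where a bandwidth saving of this size can come from, it helps to compare raw data rates. The sketch below assumes uncompressed 16-bit mono PCM and compares 8kHz against the highest listed sample rate, 48kHz; both the sample format and the choice of baseline are assumptions made for illustration, and compressed formats such as mp3 or opus will behave differently.

```python
def pcm_bytes_per_second(sample_rate_hz, bits_per_sample=16, channels=1):
    """Data rate of uncompressed PCM audio in bytes per second."""
    return sample_rate_hz * bits_per_sample // 8 * channels


telephony_rate = pcm_bytes_per_second(8000)    # 8 kHz telephony output
full_band_rate = pcm_bytes_per_second(48000)   # 48 kHz output

# Ratio between the two data rates.
reduction = full_band_rate / telephony_rate

if __name__ == "__main__":
    print(telephony_rate, full_band_rate, reduction)
```

With these assumptions the 8kHz stream needs 16,000 bytes per second against 96,000 for 48kHz, a sixfold difference.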

Audio Content Creation

Convert blog posts, news articles, and documentation into audio format. Regional news platforms use our API to produce audio versions of articles in Tamil, Telugu, Hindi, and other languages — reaching audiences who prefer listening over reading.

Accessibility

Make your application accessible to visually impaired users by converting screen content to speech in their preferred Indian language. Our multi-language support means accessibility isn't limited to English speakers.

E-Learning and EdTech

Create audio lessons, pronunciation guides, and study material in regional languages. Students hear course content in their mother tongue, improving comprehension and retention. Batch processing lets you convert entire textbooks into audio in minutes; since each request accepts at most 2,500 characters of input, long texts need to be split into chunks first.
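
Since the input field is capped at 2,500 characters, long material such as a textbook chapter has to be divided into request-sized pieces. One possible approach, sketched below, packs whole sentences into chunks and falls back to a hard split only when a single sentence is longer than the limit. chunk_text is a hypothetical helper, and it assumes the API counts characters the way Python's len() does.

```python
import re

MAX_INPUT_CHARS = 2500

# Split on whitespace that follows sentence-ending punctuation,
# including the Devanagari danda used in Hindi and related scripts.
_SENTENCE_BREAK = re.compile(r"(?<=[.!?।])\s+")


def chunk_text(text, limit=MAX_INPUT_CHARS):
    """Split text into chunks of at most `limit` characters.

    Sentences are kept whole where possible and rejoined with a single
    space, so original whitespace between sentences is not preserved.
    """
    chunks = []
    current = ""
    for sentence in _SENTENCE_BREAK.split(text.strip()):
        # A single sentence longer than the limit is cut at the limit,
        # which may fall in the middle of a word.
        while len(sentence) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:limit])
            sentence = sentence[limit:]
        if not sentence:
            continue
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= limit:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk can then be sent as the input of its own request, and the resulting audio files concatenated or played in order.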

Notification and Alert Systems

Send voice notifications for critical alerts: delivery updates, appointment reminders, payment confirmations, and emergency broadcasts. Audio messages often see higher engagement than text, especially among users with limited literacy.

Integration

Our TTS API is OpenAI-compatible. Switching from OpenAI's TTS to CallMissed takes two changes to your client setup (you will also need to pass a CallMissed model and voice name in each request):

Change base URL to https://api.callmissed.com/v1
Use your cm_ API key instead of OpenAI's key

Works with the official OpenAI Python and Node.js SDKs out of the box.
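
With the official OpenAI Python SDK, the switch might look like the sketch below. The environment variable name CALLMISSED_API_KEY and the helper names are illustrative. The language field is not a named parameter of the SDK's speech method, so it is passed through extra_body, the SDK's mechanism for adding extra fields to the JSON request body; whether the server reads it from there is an assumption based on the request body description above.

```python
import os

BASE_URL = "https://api.callmissed.com/v1"


def make_client(api_key=None):
    """Create an OpenAI SDK client pointed at the CallMissed base URL."""
    # Imported lazily so speech_kwargs can be used without the SDK installed.
    from openai import OpenAI

    # CALLMISSED_API_KEY is a hypothetical variable name for your cm_ key.
    api_key = api_key or os.environ["CALLMISSED_API_KEY"]
    return OpenAI(base_url=BASE_URL, api_key=api_key)


def speech_kwargs(text, voice="ritu", language="hi-IN"):
    """Arguments for client.audio.speech.create using a Sarvam voice."""
    return {
        "model": "bulbul:v3",
        "voice": voice,
        "input": text,
        "response_format": "mp3",
        # Fields outside the OpenAI schema go into the request body as-is.
        "extra_body": {"language": language},
    }


def synthesize(client, text, out_path):
    """Request speech for text and write the returned audio bytes to out_path."""
    response = client.audio.speech.create(**speech_kwargs(text))
    with open(out_path, "wb") as out:
        out.write(response.read())


# Example call (needs the openai package, a valid key, and network access):
# synthesize(make_client(), "नमस्ते, आप कैसे हैं?", "greeting.mp3")
```

Other Sarvam-only fields such as speech_sample_rate could be added to extra_body in the same way.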

Get Started

Every account gets $5 in free credits and 50 TTS calls per month on the free tier. Create your account and synthesize your first audio in under a minute.