Generate Audio - InstaView API

Converts text to speech using AI-powered voices. Returns an audio stream in MP3 format that can be played, downloaded, or integrated into applications.

Overview

The Text-to-Speech API allows you to generate natural-sounding speech from text in multiple languages and voices. This endpoint is useful for creating voice content, accessibility features, voice assistants, and more.

Use Cases

Voice Assistants: Generate spoken responses for chatbots and virtual assistants
Accessibility: Create audio versions of text content for visually impaired users
Content Creation: Produce voiceovers for videos, podcasts, or presentations
E-Learning: Generate audio narration for educational materials
IVR Systems: Create dynamic voice prompts for phone systems

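Basic Usage

Send a `POST` request to `https://api.instaview.sk/v1/public/voice` with a JSON body containing `text`, `voice`, and `language`:
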
{
  "text": "Hello, welcome to InstaView. We're excited to help you conduct AI-powered interviews.",
  "voice": "ALEX",
  "language": "en"
}

Voice Options

The API supports 12 different voices with various characteristics:

Voice Name	Gender	Best For
ALEX	Male	Professional, clear delivery
PETER	Male	Warm, conversational tone
MIRIAM	Female	Professional, authoritative
SUE	Female	Friendly, approachable
VIERA	Female	Clear, neutral tone
CASANDRA	Female	Warm, engaging
SILVIA	Female	Professional, polished
MICHAEL	Male	Deep, authoritative
LUKE	Male	Energetic, dynamic
EMMA	Female	Clear, friendly
SARAH	Female	Warm, professional
EVA	Female	Neutral, versatile

Language Support

The API supports the 32 languages listed below. Use the language code that matches your text:

Language	Code	Language	Code	Language	Code
English	`en`	Japanese	`ja`	Chinese	`zh`
German	`de`	Hindi	`hi`	French	`fr`
Korean	`ko`	Portuguese	`pt`	Italian	`it`
Spanish	`es`	Indonesian	`id`	Dutch	`nl`
Turkish	`tr`	Filipino	`fil`	Polish	`pl`
Swedish	`sv`	Bulgarian	`bg`	Romanian	`ro`
Arabic	`ar`	Czech	`cs`	Greek	`el`
Finnish	`fi`	Croatian	`hr`	Malay	`ms`
Slovak	`sk`	Danish	`da`	Tamil	`ta`
Ukrainian	`uk`	Russian	`ru`	Hungarian	`hu`
Norwegian	`no`	Vietnamese	`vi`
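
Since an unlisted voice name or language code is rejected by the API, it can help to check both values locally before sending a request. The following is a minimal sketch, not part of any official client: the names `VOICES`, `LANGUAGES`, and `build_payload` are hypothetical, and the sets are simply copied from the tables above. It assumes values must match the listed spelling exactly.

```python
# Voice names and language codes copied from the tables above.
VOICES = frozenset({
    "ALEX", "PETER", "MIRIAM", "SUE", "VIERA", "CASANDRA",
    "SILVIA", "MICHAEL", "LUKE", "EMMA", "SARAH", "EVA",
})
LANGUAGES = frozenset({
    "en", "ja", "zh", "de", "hi", "fr", "ko", "pt", "it", "es", "id",
    "nl", "tr", "fil", "pl", "sv", "bg", "ro", "ar", "cs", "el", "fi",
    "hr", "ms", "sk", "da", "ta", "uk", "ru", "hu", "no", "vi",
})


def build_payload(text: str, voice: str, language: str) -> dict:
    """Return a request body, raising ValueError for unlisted values."""
    # Assumption: the API expects the exact spelling shown in the tables.
    if voice not in VOICES:
        raise ValueError(f"Unknown voice: {voice!r}")
    if language not in LANGUAGES:
        raise ValueError(f"Unknown language code: {language!r}")
    return {"text": text, "voice": voice, "language": language}
```

The returned dictionary can be passed as the `json` argument of `requests.post`.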

Response Format

The endpoint returns an audio stream with Content-Type: audio/mpeg. You can:

Stream directly: Play the audio in real-time
Save to file: Download and store the MP3 file
Integrate: Use in applications, websites, or phone systems
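
For the "save to file" case, the audio can be written chunk by chunk as it arrives instead of being held in memory first. The sketch below is one possible approach: `save_chunks` is a hypothetical helper that accepts any iterable of byte chunks, and the commented usage assumes the `requests` library with `stream=True`.

```python
from typing import Iterable


def save_chunks(chunks: Iterable[bytes], path: str) -> int:
    """Write audio chunks to `path` as they arrive; return bytes written."""
    total = 0
    with open(path, "wb") as f:
        for chunk in chunks:
            if chunk:  # skip empty chunks
                f.write(chunk)
                total += len(chunk)
    return total


# Usage with requests (not executed here; URL, HEADERS and payload are
# placeholders you define yourself):
# with requests.post(URL, headers=HEADERS, json=payload, stream=True) as r:
#     r.raise_for_status()
#     save_chunks(r.iter_content(chunk_size=8192), "output.mp3")
```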

Example: Saving Audio (Node.js)

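const fs = require('fs');

// Top-level await is not available in CommonJS modules, so the request
// runs inside an async function. The built-in fetch needs Node.js 18+.
async function main() {
  const response = await fetch('https://api.instaview.sk/v1/public/voice', {
    method: 'POST',
    headers: {
      'Authorization': 'Bearer YOUR_API_KEY',
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({
      text: 'Hello, this is a test.',
      voice: 'SARAH',
      language: 'en'
    })
  });

  if (!response.ok) {
    throw new Error(`API error: ${response.status}`);
  }

  // Read the full MP3 body into memory, then write it to disk.
  const buffer = await response.arrayBuffer();
  fs.writeFileSync('output.mp3', Buffer.from(buffer));
}

main().catch(console.error);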

Example: Playing Audio (Browser)

const response = await fetch('https://api.instaview.sk/v1/public/voice', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer YOUR_API_KEY',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    text: 'Hello, this is a test.',
    voice: 'EMMA',
    language: 'en'
  })
});

if (!response.ok) {
  throw new Error(`API error: ${response.status}`);
}

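// Wrap the MP3 bytes in an object URL so an Audio element can play them.
const blob = await response.blob();
const audioUrl = URL.createObjectURL(blob);
const audio = new Audio(audioUrl);
// Release the object URL once playback finishes.
audio.addEventListener('ended', () => URL.revokeObjectURL(audioUrl));
// play() returns a promise that rejects if the browser blocks autoplay.
await audio.play();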

Example: Python

import requests

response = requests.post(
    'https://api.instaview.sk/v1/public/voice',
    headers={
        'Authorization': 'Bearer YOUR_API_KEY',
        'Content-Type': 'application/json'
    },
    json={
        'text': 'Hello, this is a test.',
        'voice': 'MICHAEL',
        'language': 'en'
    }
)

response.raise_for_status()

with open('output.mp3', 'wb') as f:
    f.write(response.content)

Text Limitations

The text field has a maximum length of 5,000 characters. For longer content, split the text into multiple requests.
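
One way to split longer content is to break it at sentence boundaries so that each request stays within the limit. The following is a rough sketch under stated assumptions: `split_text` is a hypothetical helper, it measures length with Python's `len` (the API may count characters differently), and it falls back to a hard cut when a single sentence exceeds the limit.

```python
import re

MAX_CHARS = 5000  # limit stated above


def split_text(text: str, limit: int = MAX_CHARS) -> list[str]:
    """Split text into chunks of at most `limit` characters,
    preferring sentence boundaries."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        # Hard-split any single sentence that is longer than the limit.
        while len(sentence) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:limit])
            sentence = sentence[limit:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= limit:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each chunk can then be sent as its own request, and the resulting MP3 files played or stored in order.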

Authentication & Scopes

This endpoint requires the VOICE scope. Ensure your API key has this scope enabled before making requests.
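
To keep the key out of source code, the Authorization header can be built from an environment variable. This is only a sketch: the helper `auth_headers` and the variable name `INSTAVIEW_API_KEY` are assumptions, not names defined by the API.

```python
import os


def auth_headers(env_var: str = "INSTAVIEW_API_KEY") -> dict:
    """Build the Bearer Authorization header from an environment variable."""
    # Assumption: the variable name is illustrative, not defined by the API.
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set {env_var} to an API key with the VOICE scope")
    return {"Authorization": f"Bearer {key}"}
```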

Error Handling

Common errors you may encounter:

400 Bad Request: Invalid voice name or language code
401 Unauthorized: Missing or invalid API key
403 Forbidden: API key lacks VOICE scope
413 Payload Too Large: Text exceeds 5,000 characters
429 Too Many Requests: Rate limit exceeded
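
The status codes above can be mapped to readable exceptions in client code. The sketch below uses hypothetical names (`VoiceApiError`, `check_status`), and treating only 429 as retryable is an assumption made for illustration.

```python
# Messages copied from the error list above.
ERROR_HINTS = {
    400: "Invalid voice name or language code",
    401: "Missing or invalid API key",
    403: "API key lacks VOICE scope",
    413: "Text exceeds 5,000 characters",
    429: "Rate limit exceeded",
}
RETRYABLE = {429}  # assumption: only rate-limit errors are worth retrying


class VoiceApiError(Exception):
    """Raised for any 4xx or 5xx status returned by the endpoint."""

    def __init__(self, status: int):
        self.status = status
        self.retryable = status in RETRYABLE
        hint = ERROR_HINTS.get(status, "Unexpected error")
        super().__init__(f"HTTP {status}: {hint}")


def check_status(status: int) -> None:
    """Raise VoiceApiError for error statuses; do nothing otherwise."""
    if status >= 400:
        raise VoiceApiError(status)
```

With `requests`, this would be called as `check_status(response.status_code)` before reading the audio.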

Rate Limiting

This endpoint is subject to your API key’s rate limits. Monitor the rate limit headers in the response:

X-RateLimit-Limit: Maximum requests allowed
X-RateLimit-Remaining: Requests remaining in current window
X-RateLimit-Reset: Time when the rate limit resets
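
These headers can be used to decide how long to pause before the next request. The sketch below assumes that `X-RateLimit-Reset` is a Unix timestamp in seconds, which this page does not confirm; check a real response before relying on it. `wait_time` is a hypothetical helper.

```python
import time
from typing import Mapping, Optional


def wait_time(headers: Mapping[str, str], now: Optional[float] = None) -> float:
    """Return seconds to wait before the next request (0.0 if none needed)."""
    try:
        remaining = int(headers["X-RateLimit-Remaining"])
        # Assumption: the reset value is a Unix timestamp in seconds.
        reset_at = float(headers["X-RateLimit-Reset"])
    except (KeyError, ValueError):
        return 0.0  # headers absent or in another format: do not wait
    if remaining > 0:
        return 0.0
    now = time.time() if now is None else now
    return max(0.0, reset_at - now)
```

A caller could run `time.sleep(wait_time(response.headers))` between requests. Note that a plain `dict` lookup is case-sensitive, whereas `requests` exposes headers through a case-insensitive mapping.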

Related Resources

API Keys Guide: Learn about API key management and scopes
Rate Limiting: Understand rate limits and best practices
Error Handling: Handle API errors gracefully
Scopes & Permissions: Learn about API key scopes

Authorizations

Authorization (string, header, required): API key for authentication using Bearer scheme

Query Parameters

companyId (string): Required for ATS API keys to specify which company to access. Ignored for standard company API keys.
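
For ATS API keys, the company identifier travels in the query string rather than in the JSON body. A minimal sketch of building the request URL follows; `voice_url` is a hypothetical helper and the identifier used in the example is made up.

```python
from typing import Optional
from urllib.parse import urlencode

BASE_URL = "https://api.instaview.sk/v1/public/voice"


def voice_url(company_id: Optional[str] = None) -> str:
    """Return the endpoint URL, adding companyId only when one is given."""
    if company_id is None:
        return BASE_URL
    return f"{BASE_URL}?{urlencode({'companyId': company_id})}"


# Example with a made-up identifier:
# voice_url("c_42") gives
# "https://api.instaview.sk/v1/public/voice?companyId=c_42"
```

With `requests`, passing `params={"companyId": company_id}` achieves the same result.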

Body

application/json

text (string, required): The text to convert to speech. Maximum string length: 5000. Example: "Hello, this is a test."

voice (enum<string>, required): The voice name to use for generation. Available options: ALEX, PETER, MIRIAM, SUE, VIERA, CASANDRA, SILVIA, MICHAEL, LUKE, EMMA, SARAH, EVA. Example: "ALEX"

language (enum<string>, required): The language of the text. Available options: en, ja, zh, de, hi, fr, ko, pt, it, es, id, nl, tr, fil, pl, sv, bg, ro, ar, cs, el, fi, hr, ms, sk, da, ta, uk, ru, hu, no, vi. Example: "en"

Response

Audio stream generated successfully

The response is of type file.
