The Sarvam Conversational AI SDK is a Python package that helps developers build and extend conversational agents. It provides core components to manage conversation flow, language preferences, and messaging, making it easier to develop interactive and context-aware AI experiences.

Overview

The Sarvam Conv AI SDK enables developers to create tools that can:

Add agentic capabilities, such as calling an API in the middle of a conversation
Manage agent-specific variables
Control and modify the language used during conversations
Send dynamic messages to both the user and the underlying language model (LLM)

Installation

Basic Installation

Install the SDK via pip:

pip install sarvam-conv-ai-sdk

Audio Support (Optional)

To use the audio streaming features (microphone input and speaker output), install PyAudio, which depends on the PortAudio system library:

Option 1: Install with audio support

pip install "sarvam-conv-ai-sdk[all]"

Note: You’ll need to install PortAudio first:

macOS: brew install portaudio
Ubuntu/Debian: sudo apt-get install portaudio19-dev
Windows: Download from http://www.portaudio.com/download.html

Option 2: Use without PyAudio

The SDK works without PyAudio in environments that do not need local capture or playback; the built-in microphone and speaker features will not be available. You can still:

Use the WebSocket client for real-time voice conversations (provide your own audio I/O)
Build backend proxies for frontend applications

AsyncSamvaadAgent

AsyncSamvaadAgent runs a real-time voice conversation from a small set of inputs:

Provide an InteractionConfig: who the user is, which app to talk to, the interaction type, and the audio sample rate. Optionally include overrides such as agent_variables and the initial language or state.
Create an AsyncSamvaadAgent with your API key, the config, an optional audio interface, and callbacks for text, audio, and events.
Start the agent: it fetches a signed WebSocket URL, sends interaction_start, and streams audio and text in both directions.

Key features

Real-time voice interaction: natural two-way speech
Automatic audio management: built-in microphone input and speaker output
Async/await support: non-blocking operations
Callback handling: process text, audio, and events asynchronously
Connection management: robust WebSocket handling

Minimal example:

import asyncio
from pydantic import SecretStr
from sarvam_conv_ai_sdk import AsyncSamvaadAgent, AsyncDefaultAudioInterface, InteractionConfig, InteractionType, ServerTextChunkMsg, SarvamToolLanguageName
from sarvam_conv_ai_sdk.messages.types import UserIdentifierType

async def handle_text(msg: ServerTextChunkMsg):
    print("Agent:", msg.text)

async def main(app_id: str, api_key: str):
    config = InteractionConfig(
        user_identifier_type=UserIdentifierType.CUSTOM,
        user_identifier="demo_user",
        org_id="org_ai",
        workspace_id="workspace_id",
        app_id=app_id,
        interaction_type=InteractionType.CALL,
        agent_variables={"agent_variable_1": "value"},
        initial_language_name=SarvamToolLanguageName.HINDI,
        sample_rate=16000,
    )

    agent = AsyncSamvaadAgent(
        api_key=SecretStr(api_key),
        config=config,
        audio_interface=AsyncDefaultAudioInterface(input_sample_rate=16000),
        text_callback=handle_text,
    )

    await agent.start()
    try:
        # Wait until the WebSocket disconnects or the agent is stopped
        await agent.wait_for_disconnect()
    finally:
        await agent.stop()

if __name__ == "__main__":
    asyncio.run(main(app_id="your_app_id", api_key="your_api_key"))

AsyncSamvaadAgent parameters

Parameter	Type	Required	Description
api_key	SecretStr	Yes	API key used to fetch a signed WebSocket URL
config	InteractionConfig	Yes	Interaction start configuration (user id, app id, sample rate, overrides)
audio_interface	AsyncAudioInterface or None	No	Automatic mic capture and speaker playback. Omit for headless usage (use `send_audio`)
text_callback	Callable[[ServerTextChunkMsg], Awaitable[None]] or None	No	Receives streaming text chunks from the agent
audio_callback	Callable[[ServerAudioChunkMsg], Awaitable[None]] or None	No	Receives audio chunks if not using `audio_interface` for playback
event_callback	Callable[[ServerEventBase], Awaitable[None]] or None	No	Receives events like interaction_connected, user_interrupt, interaction_end
base_url	str	No	Override base URL. Default: `https://apps.sarvam.ai/api/app-runtime/`

Methods:

await agent.start() — start and connect
await agent.stop() — stop and cleanup
await agent.wait_for_connect(timeout: float | None = 5.0) — wait until connected
await agent.wait_for_disconnect() — wait until disconnected or stopped
agent.is_connected() — connection status
await agent.send_audio(audio_bytes: bytes) — send raw 16‑bit PCM audio
agent.get_interaction_id() — current interaction id or None
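
The methods above compose into a simple session lifecycle. The following is a minimal sketch, assuming `agent` is an already constructed AsyncSamvaadAgent. The helper name `run_session` is hypothetical, and because this page does not say how `wait_for_connect` signals a timeout, the sketch re-checks `is_connected()` instead of relying on a return value or exception.

```python
import asyncio
from typing import Any, Optional


async def run_session(agent: Any, connect_timeout: float = 5.0) -> Optional[str]:
    """Start the agent, wait for the connection, and return the interaction id.

    `agent` only needs to expose the methods listed above.
    """
    await agent.start()
    try:
        await agent.wait_for_connect(timeout=connect_timeout)
        if not agent.is_connected():
            # Assumption: treat "still not connected" as a failed start.
            return None
        interaction_id = agent.get_interaction_id()
        # Block until the conversation ends or the agent is stopped.
        await agent.wait_for_disconnect()
        return interaction_id
    finally:
        # Always release the WebSocket and any audio resources.
        await agent.stop()
```

Keeping `stop()` in a `finally` block means cleanup also runs if any of the awaited calls raises.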

Audio interface (optional): AsyncDefaultAudioInterface(input_sample_rate: int = 16000)

Methods: start(input_callback), output(audio: bytes, sample_rate?: int), interrupt(), stop()
Audio: LINEAR16 (16‑bit PCM mono). Supported sample rates: 8000, 16000
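
If your audio comes from WAV files, it can help to confirm the format before streaming. The sketch below uses only Python's standard `wave` module; the function name `check_wav_format` is hypothetical and is not part of the SDK.

```python
import io
import wave

SUPPORTED_SAMPLE_RATES = (8000, 16000)


def check_wav_format(wav_bytes: bytes) -> int:
    """Return the sample rate if the WAV data is 16-bit mono PCM at a supported rate."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        if wav.getnchannels() != 1:
            raise ValueError("audio must be mono")
        if wav.getsampwidth() != 2:  # sample width in bytes; 2 bytes = 16 bits
            raise ValueError("audio must be 16-bit PCM")
        rate = wav.getframerate()
        if rate not in SUPPORTED_SAMPLE_RATES:
            raise ValueError(f"unsupported sample rate: {rate}")
        return rate
```

The returned rate should match the `sample_rate` you pass in InteractionConfig.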

What you must provide: InteractionConfig

Required fields:

user_identifier_type: One of CUSTOM, EMAIL, PHONE_NUMBER, UNKNOWN
user_identifier: The identifier value (string; phone number, email, or custom id). You can use this id to find the interaction's logs in the log analyser.
org_id: Your organization, e.g., “sarvamai”
workspace_id: Your workspace, e.g., “default”
app_id: The target application id
interaction_type: InteractionType.CALL (voice)
sample_rate: 8000 or 16000 (16-bit PCM mono)
version (optional, unlike the fields above): int, the app version to use

Important
If version is not provided, the SDK uses the latest committed version of the app.
The connection will fail if the provided app_id has no committed version.

Optional overrides (applied server-side at start):

agent_variables: dict of key/value to seed the agent context
initial_language_name: e.g., “English”, “Hindi” (must be allowed by app)
initial_state_name: starting state name, if your app uses states
initial_bot_message: first message from the agent

Example config:

from sarvam_conv_ai_sdk import InteractionConfig, InteractionType, SarvamToolLanguageName
from sarvam_conv_ai_sdk.messages.types import UserIdentifierType

config = InteractionConfig(
    user_identifier_type=UserIdentifierType.CUSTOM,
    user_identifier="demo_user_async",
    org_id="sarvamai",
    workspace_id="default",
    app_id="your_app_id",
    interaction_type=InteractionType.CALL,
    agent_variables={"user_language": "Hindi"},
    initial_language_name=SarvamToolLanguageName.HINDI,
    initial_state_name="greeting",
    sample_rate=16000,
)

Quick start: local voice test

Install dependencies

brew install portaudio               # macOS
pip install "sarvam-conv-ai-sdk[all]"

Set credentials (or pass directly in code)

export SARVAM_APP_ID="your_app_id"
export SARVAM_API_KEY="your_api_key"

Run the example

python -m sarvam_conv_ai_sdk.examples.async_audio_example

The example uses AsyncDefaultAudioInterface to capture microphone input at 16 kHz and play the agent's responses. To target a different environment, override base_url in AsyncSamvaadAgent.

Headless mode (no PyAudio)

Use your own audio I/O. Create the agent without audio_interface and push raw 16‑bit PCM mono chunks that match config.sample_rate.

agent = AsyncSamvaadAgent(api_key=SecretStr("your_api_key"), config=config, text_callback=handle_text)
await agent.start()

# Send raw audio bytes
await agent.send_audio(raw_pcm_bytes)  # LINEAR16 mono at 16kHz or 8kHz

await agent.stop()
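
When the audio is already in memory, one way to feed it to `send_audio` is in small, evenly paced chunks. This is a sketch under stated assumptions: the 20 ms chunk size and the real-time pacing are illustrative choices rather than documented SDK requirements, and `stream_pcm` is a hypothetical helper name.

```python
import asyncio
from typing import Any

BYTES_PER_SAMPLE = 2  # LINEAR16 mono: each sample is one 16-bit value


async def stream_pcm(agent: Any, pcm: bytes, sample_rate: int = 16000,
                     chunk_ms: int = 20, realtime: bool = True) -> int:
    """Send raw PCM to the agent in fixed-duration chunks; return the chunk count."""
    samples_per_chunk = sample_rate * chunk_ms // 1000
    chunk_bytes = samples_per_chunk * BYTES_PER_SAMPLE
    sent = 0
    for offset in range(0, len(pcm), chunk_bytes):
        await agent.send_audio(pcm[offset:offset + chunk_bytes])
        sent += 1
        if realtime:
            # Pace the stream roughly like a live microphone would.
            await asyncio.sleep(chunk_ms / 1000)
    return sent
```

At 16000 Hz, a 20 ms chunk is 320 samples, or 640 bytes; at 8000 Hz it is 320 bytes.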

Connect your frontend (backend proxy pattern)

AsyncSamvaadAgent usage is covered above; a backend bridge follows the same pattern inside your server. The frontend sends messages in these shapes:

Frontend → backend (init):

{
  "type": "init",
  "app_id": "your_app_id",
  "context": {"language": "English", "user_name": "Priya"}
}

Frontend → backend (text):

{ "type": "text", "data": { "text": "Hello" } }

Frontend → backend (audio):

{ "type": "audio", "data": "<base64-raw-pcm>" }

Bridge essentials on the backend:

Build InteractionConfig from init context; create AsyncSamvaadAgent with callbacks.
Decode base64 and forward audio via await agent.send_audio(audio_bytes).
In the text, audio, and event callbacks, relay each message back to the frontend (for example with websocket.send_json).

Minimal sketch:

session.agent = AsyncSamvaadAgent(
    api_key=SecretStr(api_key),
    config=config,
    text_callback=session._handle_text,
    audio_callback=session._handle_audio,
    event_callback=session._handle_event,
)
await session.agent.start()
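
The decode-and-forward step from the bridge essentials can be sketched as follows. The function name `forward_frontend_message` is hypothetical. Only audio is forwarded, because `send_audio` is the only sending method listed on this page; how text messages reach the agent is left to your bridge.

```python
import base64
import json
from typing import Any, Optional


async def forward_frontend_message(agent: Any, raw_message: str) -> Optional[str]:
    """Parse one frontend message and forward audio payloads to the agent.

    Returns the message type so the caller can handle "init" and "text" itself.
    """
    message = json.loads(raw_message)
    msg_type = message.get("type")
    if msg_type == "audio":
        # "data" carries base64-encoded raw PCM, as in the message shape above.
        audio_bytes = base64.b64decode(message["data"])
        await agent.send_audio(audio_bytes)
    return msg_type
```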

Requirements for Async Audio

PyAudio installation:
```
pip install "sarvam-conv-ai-sdk[all]"
```
System dependencies:
- macOS: brew install portaudio
- Ubuntu/Debian: sudo apt-get install portaudio19-dev
- Windows: download from http://www.portaudio.com/download.html

Environment variables (optional convenience):

export SARVAM_APP_ID="your_app_id"
export SARVAM_API_KEY="your_api_key"
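
A small helper can read these variables and fail early with a clear message when one is missing. This is an illustrative sketch; `load_credentials` is a hypothetical name, not an SDK function.

```python
import os
from typing import Tuple

REQUIRED_VARS = ("SARVAM_APP_ID", "SARVAM_API_KEY")


def load_credentials() -> Tuple[str, str]:
    """Return (app_id, api_key) from the environment, or raise if either is unset."""
    missing = [name for name in REQUIRED_VARS if not os.environ.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return os.environ["SARVAM_APP_ID"], os.environ["SARVAM_API_KEY"]
```

Wrap the returned API key in `SecretStr` before passing it to AsyncSamvaadAgent, as in the examples above.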

Complete Example

See sarvam_conv_ai_sdk/examples/async_audio_example.py for a full, runnable script with mic capture, callbacks, and clean shutdown.

Custom Tools

Example Usage

import httpx
from pydantic import Field

from sarvam_conv_ai_sdk import (
    SarvamInteractionTurnRole,
    SarvamOnEndTool,
    SarvamOnEndToolContext,
    SarvamOnStartTool,
    SarvamOnStartToolContext,
    SarvamTool,
    SarvamToolContext,
    SarvamToolLanguageName,
    SarvamToolOutput,
)

class OnStart(SarvamOnStartTool):  # The class must be named OnStart
    async def run(self, context: SarvamOnStartToolContext):
        user_id = context.get_user_identifier()
        async with httpx.AsyncClient() as client:
            response = await client.get(f"https://sarvam-flights.com/users/{user_id}")
            response.raise_for_status()
            user_data = response.json()

        source_destination = user_data.get("home_city")
        context.set_agent_variable("source_destination", source_destination)
        context.set_agent_variable("passenger_name", user_data.get("name"))
        
        # Store telephony call SID if available (for telephony channels)
        if context.provider_ref_id:
            context.set_agent_variable("call_sid", context.provider_ref_id)
        
        context.set_initial_language_name(SarvamToolLanguageName.ENGLISH)
        context.set_initial_bot_message(
            f"Hello! Would you like to book a flight from {source_destination}? Where would you like to go?",
        )
        return context


class BookFlight(SarvamTool):
    """Book a flight based on the user's travel preferences."""

    destination: str = Field(description="City of destination")
    travel_date: str = Field(description="Date of travel (YYYY-MM-DD)")

    async def run(self, context: SarvamToolContext) -> SarvamToolOutput:
        source_destination = context.get_agent_variable("source_destination")
        booking_data = {
            "source": source_destination,
            "destination": self.destination,
            "travel_date": self.travel_date,
            "passenger_name": context.get_agent_variable("passenger_name"),
        }

        async with httpx.AsyncClient() as client:
            response = await client.post(
                "https://sarvam-flights.com/book", json=booking_data
            )
            response.raise_for_status()
            booking_result = response.json()

        if booking_result.get("status") == "confirmed":
            context.set_agent_variable("booking_id", booking_result.get("booking_id"))
            context.set_end_conversation()
            return SarvamToolOutput(
                message_to_user=f"Flight booked successfully to {self.destination}!",
                context=context,
            )
        else:
            context.change_state("recommend_destinations")
            return SarvamToolOutput(
                message_to_llm="Booking failed. Please suggest similar destinations.",
                context=context,
            )


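class OnEnd(SarvamOnEndTool):  # The class must be named OnEnd
    async def run(self, context: SarvamOnEndToolContext):
        # Fall back to an empty string so .lower() is safe when no feedback was captured
        feedback = context.get_agent_variable("feedback") or ""
        negative_words = ["bad", "poor", "disappointed", "unhappy", "problem"]
        # Compute the flag up front so it is always defined, even for an empty transcript
        is_negative = any(word in feedback.lower() for word in negative_words)
        interaction_transcript = context.get_interaction_transcript()
        if interaction_transcript.interaction_transcript:
            for turn in interaction_transcript.interaction_transcript:
                if turn.role == SarvamInteractionTurnRole.USER:
                    # Also flag negative wording in the user's own (English) turns
                    is_negative = is_negative or any(
                        word in turn.en_text.lower() for word in negative_words
                    )
        context.set_agent_variable("feedback_sentiment", is_negative)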
        
        # Log call details if telephony SID is available
        if context.provider_ref_id:
            async with httpx.AsyncClient() as client:
                await client.post(
                    "https://sarvam-flights.com/analytics/call-logs",
                    json={
                        "call_sid": context.provider_ref_id,
                        "user_id": context.get_user_identifier(),
                        "sentiment": is_negative,
                        "duration": (
                            interaction_transcript.interaction_end_time 
                            - interaction_transcript.interaction_start_time
                        ).total_seconds()
                    }
                )

        return context

Base Classes

The SDK exposes three base classes for tool development:

1. `SarvamTool`

Primary base class for all operational tools invoked during conversation flow. Example:

class MyCustomTool(SarvamTool):
    """Brief description of the tool's purpose."""

    tool_variable: type = Field(description="Description of this input parameter")

    async def run(self, context: SarvamToolContext) -> SarvamToolOutput:
        # Custom tool logic
        return SarvamToolOutput(
            message_to_user="Response to user",
            message_to_llm="Context for LLM",
            context=context
        )

2. `SarvamOnStartTool`

Executed at the beginning of a conversation, typically for initialization. The class must be named OnStart.

3. `SarvamOnEndTool`

Executed at the end of a conversation, typically for cleanup or post-processing. The class must be named OnEnd.

Context Classes and Methods

`SarvamToolContext`

The context object passed to SarvamTool.run() methods.

Variable Management

`get_agent_variable(variable_name: str) -> Any`: Retrieve the value of a variable.
`set_agent_variable(variable_name: str, value: Any) -> None`: Update a variable’s value.

Language Control

`get_current_language() -> SarvamToolLanguageName`: Return the agent’s current language.
`change_language(language: SarvamToolLanguageName) -> None`: Update the language preference.

Conversation Flow

`set_end_conversation() -> None`: Explicitly end the conversation.

State Management

`get_current_state() -> str`: Return the current state of the conversation.
`change_state(state: str) -> None`: Transition to a new state. Note: The new state must be one of the next valid states defined in the agent configuration.

Engagement Metadata

`get_engagement_metadata() -> EngagementMetadata`: Retrieve the engagement metadata containing information about the current interaction.

`SarvamOnStartToolContext`

The context object passed to SarvamOnStartTool.run() methods.

Variable Management

`get_agent_variable(variable_name: str) -> Any`: Retrieve the value of a variable.
`set_agent_variable(variable_name: str, value: Any) -> None`: Update a variable’s value.

User Information

`get_user_identifier() -> str`: Get the user identifier.

Telephony Information

`provider_ref_id: Optional[str]`: The reference ID from the channel provider. For telephony providers, this contains the Call SID (Session ID), which uniquely identifies a specific phone call. For other channel providers, it contains their respective reference IDs. Defaults to None for channels that don’t provide reference IDs.

Initialization Methods

`set_initial_bot_message(message: str) -> None`: Set the first message sent by the agent when the conversation starts.
`set_initial_state_name(state_name: str) -> None`: Set the initial state from which the agent should start.
`set_initial_language_name(language: SarvamToolLanguageName) -> None`: Define the initial language preference for the user.

Engagement Metadata

`get_engagement_metadata() -> EngagementMetadata`: Retrieve the engagement metadata containing information about the current interaction.

`SarvamOnEndToolContext`

The context object passed to SarvamOnEndTool.run() methods.

Variable Management

`get_agent_variable(variable_name: str) -> Any`: Retrieve the value of a variable.
`set_agent_variable(variable_name: str, value: Any) -> None`: Update a variable’s value.

User Information

`get_user_identifier() -> str`: Get the user identifier.

Telephony Information

`provider_ref_id: Optional[str]`: The reference ID from the channel provider. For telephony providers, this contains the Call SID (Session ID), which uniquely identifies a specific phone call. For other channel providers, it contains their respective reference IDs. Defaults to None for channels that don’t provide reference IDs.

Engagement Metadata

`get_engagement_metadata() -> EngagementMetadata`: Retrieve the engagement metadata containing information about the current interaction.

Interaction Reattempt

`set_retry_interaction`: Request another attempt to reach the user with the same agent. Useful when a business goal has not been met.

Interaction Transcript

`get_interaction_transcript() -> SarvamInteractionTranscript`: Retrieve the conversation history: the user and agent messages in English, plus the timestamps at which the conversation began and ended (format: yyyy-mm-dd hh:mm:ss).

Example transcript:

[
    SarvamInteractionTurn(role=<SarvamInteractionTurnRole.AGENT: 'agent'>, en_text='Hello! How can I help you today?'),
    SarvamInteractionTurn(role=<SarvamInteractionTurnRole.USER: 'user'>, en_text='I need to book a flight'),
    SarvamInteractionTurn(role=<SarvamInteractionTurnRole.AGENT: 'agent'>, en_text='I can help you with that. Where would you like to go?'),
    SarvamInteractionTurn(role=<SarvamInteractionTurnRole.USER: 'user'>, en_text='I want to go to Mumbai'),
    SarvamInteractionTurn(role=<SarvamInteractionTurnRole.AGENT: 'agent'>, en_text='Great! When would you like to travel?')
]
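
A transcript like this can be reduced to the user's turns and the call duration. The sketch below is duck-typed so it runs without the SDK; `summarize_transcript` is a hypothetical helper. It assumes the start and end times are `datetime` objects (as in the OnEnd example, which subtracts them) and that the user role's enum value is 'user', as the repr above shows.

```python
from typing import Any, Dict


def summarize_transcript(transcript: Any) -> Dict[str, Any]:
    """Collect the user's English turns and the interaction duration in seconds."""
    turns = transcript.interaction_transcript or []
    user_texts = [
        turn.en_text
        for turn in turns
        # Works for enum members (via .value) and for plain strings.
        if getattr(turn.role, "value", turn.role) == "user"
    ]
    duration = (
        transcript.interaction_end_time - transcript.interaction_start_time
    ).total_seconds()
    return {"user_texts": user_texts, "duration_seconds": duration}
```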

Return Types

`SarvamToolOutput`

The return type for SarvamTool.run() methods. Contains:

message_to_user: Optional[str] - Message that is sent directly to the user
message_to_llm: Optional[str] - Message that is sent to the LLM, which then responds
context: SarvamToolContext - The updated context object

Note: At least one of message_to_llm or message_to_user must be set.
Important: When both are set, the user receives only message_to_user, while message_to_llm replaces message_to_user in the chat thread that forms the LLM’s context.
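
The routing rule above can be modelled in a few lines of plain Python. This is an illustration of the rule as described here, not SDK code; the function and key names are hypothetical, and the case where only message_to_user is set (it is assumed to enter the LLM thread unchanged) is an inference from the rule.

```python
from typing import Dict, Optional


def route_tool_output(message_to_user: Optional[str] = None,
                      message_to_llm: Optional[str] = None) -> Dict[str, Optional[str]]:
    """Model where each message ends up, following the rule described above."""
    if message_to_user is None and message_to_llm is None:
        raise ValueError("set at least one of message_to_user or message_to_llm")
    return {
        # Only message_to_user is ever delivered directly to the user.
        "sent_to_user": message_to_user,
        # message_to_llm takes precedence in the LLM's chat thread.
        "added_to_llm_thread": (
            message_to_llm if message_to_llm is not None else message_to_user
        ),
    }
```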

`EngagementMetadata`

The engagement metadata object that can be retrieved from context objects using get_engagement_metadata(). Contains:

interaction_id: str - Unique identifier for each conversation between user & agent.
attempt_id: Optional[str] - Unique identifier for each attempt created on the platform
campaign_id: Optional[str] - Campaign ID for the interaction
interaction_language: SarvamToolLanguageName - The language used for the interaction (defaults to English)
app_id: str - Application identifier of the agent for the interaction
app_version: int - Version number of the agent
agent_phone_number: Optional[str] - Phone number associated with the conversational agent application
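
One use for these fields is tagging your own logs or analytics calls. A minimal sketch, assuming `context` is any of the tool context objects described above; `engagement_log_fields` is a hypothetical helper name.

```python
from typing import Any, Dict


def engagement_log_fields(context: Any) -> Dict[str, Any]:
    """Pick the identifiers most useful for correlating logs with an interaction."""
    meta = context.get_engagement_metadata()
    return {
        "interaction_id": meta.interaction_id,
        "app_id": meta.app_id,
        "app_version": meta.app_version,
        "campaign_id": meta.campaign_id,  # Optional field: may be None
    }
```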

Supported Languages

The SDK supports multilingual conversations using the SarvamToolLanguageName enum. Available languages include:

Bengali
Gujarati
Kannada
Malayalam
Tamil
Telugu
Punjabi
Odia
Marathi
Hindi
English

Note: Each agent can use only the subset of these languages that was selected in its configuration.

Best Practices

Always implement run(): The run() method is the entry point for tool execution logic.
Use Field() for parameters: It ensures type safety and supplies the descriptions the LLM needs in its prompt to fill in each parameter.
Gracefully handle errors: Avoid accessing unset variables or using invalid types.
Return the appropriate type: SarvamTool.run() must return SarvamToolOutput, while SarvamOnStartTool.run() and SarvamOnEndTool.run() return their respective context objects.
Write meaningful docstrings: Clearly describe what each tool is intended to do, because this directly affects how reliably the agent calls it.
Use async operations for I/O: For the best performance, use async/await for external API calls to avoid blocking.
Use context methods: Use the provided context methods for variable management, language control, and messaging instead of directly accessing context attributes.
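
As one way to handle unset variables gracefully, a tool can read agent variables through a small wrapper. This page does not specify what `get_agent_variable` does for a missing name, so the sketch covers both a raised exception and a None result; `get_variable_or_default` is a hypothetical helper.

```python
from typing import Any


def get_variable_or_default(context: Any, name: str, default: Any = None) -> Any:
    """Read an agent variable via the context method, falling back to a default."""
    try:
        value = context.get_agent_variable(name)
    except Exception:
        # Assumption: a missing variable may raise; the exact type is not documented here.
        return default
    return default if value is None else value
```

For example, `get_variable_or_default(context, "feedback", "")` yields a string that is always safe to call `.lower()` on.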

Testing Your Tools

After creating a tool, test it locally to confirm it behaves as expected.

Testing Steps

Create the ToolContext: Initialize the appropriate context object with test data
Instantiate the tool class: Use tool.model_validate(tool_args) to create a tool instance
Run the tool: Call the tool’s run() method with the context
Observe the returned object: Check if the necessary changes have been made to the context

Example Test: SarvamTool

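import asyncio

# Test the BookFlight tool
# (assumes BookFlight, SarvamToolContext, SarvamToolLanguageName, and EngagementMetadata are already imported or defined)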
async def test_book_flight():
    # 1. Create the ToolContext
    context = SarvamToolContext(
        language=SarvamToolLanguageName.ENGLISH,
        allowed_languages=[SarvamToolLanguageName.ENGLISH],
        state="booking",
        next_valid_states=["recommend_destinations", "end"],
        agent_variables={
            "source_destination": "Mumbai",
            "passenger_name": "John Doe",
            "booking_id": "123"
        },
        engagement_metadata=EngagementMetadata(
            interaction_id="123",
            attempt_id="456",
            campaign_id="789",
            interaction_language=SarvamToolLanguageName.ENGLISH,
            app_id="101",
            app_version=1,
            agent_phone_number="+1234567890",
        ),
    )
    
    # 2. Instantiate the tool class
    tool_args = {
        "destination": "Delhi",
        "travel_date": "2024-03-15"
    }
    tool_instance = BookFlight.model_validate(tool_args)
    
    # 3. Run the tool
    result = await tool_instance.run(context)
    
    # 4. Observe the returned object
    print(f"Message to user: {result.message_to_user}")
    print(f"Message to LLM: {result.message_to_llm}")
    print(f"End conversation: {result.context.end_conversation}")
    print(f"Current state: {result.context.get_current_state()}")
    print(f"Agent variables: {result.context.agent_variables}")
    print(f"Current Language: {result.context.get_current_language()}")

# Run the test
asyncio.run(test_book_flight())

Example Test: OnStart Tool

For SarvamOnStartTool, the testing approach is similar but it returns the context object directly:

# Testing OnStart tool
async def test_on_start():
    context = SarvamOnStartToolContext(
        user_identifier="user123",
        agent_variables={"source_destination": "Mumbai", "passenger_name": "John Doe"},
        engagement_metadata=EngagementMetadata(
            interaction_id="123",
            attempt_id="456",
            campaign_id="789",
            interaction_language=SarvamToolLanguageName.ENGLISH,
            app_id="101",
            app_version=1,
            agent_phone_number="+1234567890",
        ),
        initial_bot_message=None,
        initial_state_name="start",
        initial_language_name=SarvamToolLanguageName.ENGLISH,
        provider_ref_id="CA1234567890abcdef1234567890abcdef",  # Optional: for telephony channels
    )
    
    tool_instance = OnStart()
    result = await tool_instance.run(context)
    
    print(f"Initial bot message: {result.initial_bot_message}")
    print(f"Initial state: {result.initial_state_name}")
    print(f"Initial Language Name: {result.initial_language_name}")
    print(f"Agent variables: {result.agent_variables}")
    print(f"Telephony Call SID: {result.provider_ref_id}")

# Run the test
asyncio.run(test_on_start())

Example Test: OnEnd Tool

from datetime import datetime, timedelta

# Testing OnEnd tool
async def test_on_end():
    context = SarvamOnEndToolContext(
        user_identifier="user123",
        agent_variables={"feedback": "I had a bad experience", "feedback_sentiment": False},
        engagement_metadata=EngagementMetadata(
            interaction_id="123",
            attempt_id="456",
            campaign_id="789",
            interaction_language=SarvamToolLanguageName.ENGLISH,
            app_id="101",
            app_version=1,
            agent_phone_number="+1234567890",
        ),
        interaction_transcript=SarvamInteractionTranscript(
            interaction_transcript=[
                SarvamInteractionTurn(role=SarvamInteractionTurnRole.AGENT, en_text='Hello! How can I help you today?'),
                SarvamInteractionTurn(role=SarvamInteractionTurnRole.USER, en_text='I need to book a flight'),
                SarvamInteractionTurn(role=SarvamInteractionTurnRole.AGENT, en_text='I can help you with that. Where would you like to go?'),
                SarvamInteractionTurn(role=SarvamInteractionTurnRole.USER, en_text='I want to go to Mumbai'),
                SarvamInteractionTurn(role=SarvamInteractionTurnRole.AGENT, en_text='Great! When would you like to travel?')
            ],
            interaction_start_time=datetime.now() - timedelta(minutes=2),
            interaction_end_time=datetime.now(),
        ),
        retry_interaction=False,
        provider_ref_id="CA1234567890abcdef1234567890abcdef",  # Optional: for telephony channels
    )
    
    tool_instance = OnEnd()
    result = await tool_instance.run(context)
    
    print(f"Agent variables: {result.agent_variables}")
    print(f"Interaction Retry: {result.retry_interaction}")
    print(f"Telephony Call SID: {result.provider_ref_id}")

# Run the test
asyncio.run(test_on_end())

Best Practices for Async Audio

Use proper event loop setup for PyAudio compatibility:

loop = asyncio.new_event_loop()
asyncio.set_event_loop(loop)

Handle connection states gracefully:

while agent.is_connected():
    await asyncio.sleep(1)

Implement proper cleanup in finally blocks:
```
finally:
    await agent.stop()
```
Use appropriate sample rates (typically 16000 Hz for input)

Handle interruptions with KeyboardInterrupt:

except KeyboardInterrupt:
    print("Stopping conversation...")
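
Put together, the practices above might look like the following. This is a sketch, assuming `agent` is an already constructed AsyncSamvaadAgent; the function names and the `poll_seconds` parameter are illustrative.

```python
import asyncio
from typing import Any


async def converse(agent: Any, poll_seconds: float = 1.0) -> None:
    """Run until the agent disconnects, always cleaning up afterwards."""
    await agent.start()
    try:
        while agent.is_connected():
            await asyncio.sleep(poll_seconds)
    finally:
        # Runs on normal exit, on errors, and on cancellation (e.g. Ctrl+C).
        await agent.stop()


def run(agent: Any) -> None:
    """Synchronous entry point that exits quietly on Ctrl+C."""
    try:
        asyncio.run(converse(agent))
    except KeyboardInterrupt:
        print("Stopping conversation...")
```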
