This is a FastAPI-based server that acts as an interface between your application and cloud-based AI services. It focuses on three main tasks:
Transcription (Speech-to-Text)
Text-to-Speech
Speech-to-Speech
Currently, it uses OpenAI's API for these services, but it is designed so that other providers can be added in the future.
.
├── cloud_providers/
│ ├── base.py
│ └── openai_api_handler.py
├── server/
│ ├── main.py
│ ├── routers/
│ │ ├── transcribe.py
│ │ ├── tts.py
│ │ └── speech_to_speech.py
│ └── utils/
│ └── logger.py
├── requirements.txt
└── README.md
Clone the repository
Create a virtual environment:
python -m venv venv
source venv/bin/activate
Install dependencies:
pip install -r requirements.txt
Set up environment variables:
export OPENAI_API_KEY=your_openai_api_key
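The server expects the key to be available in the environment at startup. As a minimal sketch, a handler might read it and fail fast when it is missing; the helper name load_api_key is illustrative and not part of this project:

```python
import os

def load_api_key(var_name: str = "OPENAI_API_KEY") -> str:
    """Read an API key from the environment, failing fast if unset.

    Illustrative helper; the project may read the variable differently.
    """
    key = os.environ.get(var_name)
    if not key:
        raise RuntimeError(
            f"{var_name} is not set; export it before starting the server"
        )
    return key
```

Failing fast here gives a clear startup error instead of an opaque authentication failure on the first API call.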
To start the server, navigate to the project directory and run:
python server/main.py
This will start the FastAPI server, typically on http://localhost:8000.
API docs
Once the server is running, FastAPI's auto-generated interactive API documentation is available at http://localhost:8000/docs (Swagger UI) and http://localhost:8000/redoc (ReDoc).
The application uses rotating file handlers for logging, with separate log files for different components:
logs/main.log: Main application logs
logs/transcription.log: Transcription-specific logs
logs/tts.log: Text-to-speech logs
logs/speech_to_speech.log: Speech-to-speech logs
The application includes error handling for various scenarios, including API errors and WebSocket disconnections. Errors are logged, and appropriate HTTP exceptions are raised.
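A per-component rotating logger like the ones described can be built with the standard library's RotatingFileHandler. This is a hedged sketch, not the project's actual logger.py; the size and backup-count values are illustrative:

```python
import logging
from logging.handlers import RotatingFileHandler
from pathlib import Path

def make_rotating_logger(name: str, log_dir: str = "logs",
                         max_bytes: int = 1_000_000,
                         backups: int = 3) -> logging.Logger:
    """Create a logger that writes to <log_dir>/<name>.log with rotation.

    Sketch of the per-component loggers described above; rotation
    parameters are assumptions, not taken from the project.
    """
    Path(log_dir).mkdir(exist_ok=True)
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    handler = RotatingFileHandler(f"{log_dir}/{name}.log",
                                  maxBytes=max_bytes, backupCount=backups)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger
```

Rotation caps disk usage: once a log file reaches max_bytes, it is renamed and a fresh file is started, keeping at most backups old copies.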
The project is designed with extensibility in mind. The CloudProviderBase abstract base class in base.py allows for easy integration of additional cloud providers beyond OpenAI.
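The abstract-base-class pattern might look roughly like this. The method names transcribe and synthesize are assumptions made for illustration and may differ from the real interface in base.py:

```python
from abc import ABC, abstractmethod

class CloudProviderBase(ABC):
    """Interface that every cloud provider handler implements.

    Sketch based on the description above; the actual method names
    and signatures in the project's base.py may differ.
    """

    @abstractmethod
    def transcribe(self, audio: bytes) -> str:
        """Convert speech audio into text."""

    @abstractmethod
    def synthesize(self, text: str) -> bytes:
        """Convert text into speech audio."""

class EchoProvider(CloudProviderBase):
    """Toy provider used here only to show how a subclass plugs in."""

    def transcribe(self, audio: bytes) -> str:
        return audio.decode("utf-8", errors="replace")

    def synthesize(self, text: str) -> bytes:
        return text.encode("utf-8")
```

Because the routers depend only on the abstract interface, adding a new provider means writing one subclass rather than touching the endpoint code.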
Contributions are welcome! Please feel free to submit a Pull Request.
[Specify your license here]