πŸŽ™οΈ Whisper Transcriber Bot

✨ Transform voice into text instantly with AI-powered transcription magic! ✨

- A self-hosted, privacy-focused transcription bot for Telegram -
πŸš€ No GPU Required β€’ No API Keys β€’ CPU-Only β€’ Low Resource Usage ツ

✨ Features

  • πŸŽ™οΈ Voice Transcription - Convert voice messages to text instantly
  • 🎡 Multi-Format Support - MP3, M4A, WAV, OGG, FLAC audio files
  • ⚑ Concurrent Processing - Handle multiple users simultaneously
  • πŸ“ Smart Text Handling - Auto-generate text files for long transcriptions
  • 🧠 AI-Powered - OpenAI Whisper model for accurate transcription
  • πŸ’» CPU-Only Processing - No GPU required, runs on basic servers (512MB RAM minimum)
  • 🚫 No API Dependencies - No external API keys or cloud services needed
  • 🐳 Docker Ready - Easy deployment with containerization
  • πŸ”’ Privacy Focused - Process audio locally, complete data privacy
  • πŸ’° Cost Effective - Ultra-low resource usage, perfect for budget hosting

🎬 Demo

Whisper Transcriber Bot Demo

πŸš€ One-Click Deploy

Deploy to Heroku β€’ Deploy to Render β€’ Deploy to DigitalOcean

Deploy instantly to your favorite cloud platform with pre-configured settings! All platforms support CPU-only deployment - no GPU needed!

πŸ“ Quick Start

Prerequisites

  • Docker and Docker Compose
  • Telegram Bot Token (Create Bot)

Installation

  1. Clone the repository

    git clone https://github.com/Malith-Rukshan/whisper-transcriber-bot.git
    cd whisper-transcriber-bot
  2. Configure environment

    cp .env.example .env
    nano .env  # Add your bot token
  3. Download AI model

    chmod +x download_model.sh
    ./download_model.sh
  4. Deploy with Docker

    docker-compose up -d

πŸŽ‰ That's it! Your bot is now running and ready to transcribe audio!

πŸ“‹ Usage

Bot Commands

| Command | Description |
|---------|-------------|
| /start | 🏠 Welcome message and bot introduction |
| /help | πŸ“– Detailed usage instructions |
| /about | ℹ️ Bot information and developer details |
| /status | πŸ” Check bot health and configuration |
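
For orientation, the commands above are plain python-telegram-bot command handlers. The following is a minimal sketch, assuming a callback named start and the usual ApplicationBuilder setup; the real handlers in src/bot.py are richer and may be named differently.

import os
from telegram import Update
from telegram.ext import ApplicationBuilder, CommandHandler, ContextTypes

# Minimal stand-in callback; the real /start handler does more.
async def start(update: Update, context: ContextTypes.DEFAULT_TYPE) -> None:
    await update.message.reply_text("🏠 Welcome! Send me a voice message or an audio file to transcribe.")

def main() -> None:
    app = ApplicationBuilder().token(os.environ["TELEGRAM_BOT_TOKEN"]).build()
    app.add_handler(CommandHandler("start", start))  # /help, /about and /status are registered the same way
    app.run_polling()

if __name__ == "__main__":
    main()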

How to Use

  1. Send Audio πŸŽ™οΈ - Forward voice messages or upload audio files
  2. Wait for AI ⏳ - Bot processes audio (typically 1-3 seconds)
  3. Get Text πŸ“ - Receive transcription or download text file for long content
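
The "text or text file" behaviour in step 3 comes down to a length check before replying, since Telegram caps a single message at 4096 characters. A hedged sketch of that decision; the actual implementation in src/bot.py may differ.

import io
from telegram import Update
from telegram.ext import ContextTypes

TELEGRAM_TEXT_LIMIT = 4096  # Telegram's per-message character limit

async def reply_with_transcript(update: Update, context: ContextTypes.DEFAULT_TYPE, text: str) -> None:
    # Short transcriptions go straight into a message ...
    if len(text) <= TELEGRAM_TEXT_LIMIT:
        await update.message.reply_text(text)
        return
    # ... long ones are sent back as a downloadable .txt document instead.
    buffer = io.BytesIO(text.encode("utf-8"))
    await update.message.reply_document(document=buffer, filename="transcription.txt")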

Supported Formats

  • Voice Messages - Direct Telegram voice notes
  • Audio Files - MP3, M4A, WAV, OGG, FLAC (up to 50MB)
  • Document Audio - Audio files sent as documents
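
Incoming files are presumably validated against the configured limits before transcription starts. A minimal sketch using the SUPPORTED_FORMATS and MAX_AUDIO_SIZE_MB settings described under Configuration; the helper name is illustrative, not the project's actual function.

import os

SUPPORTED_FORMATS = os.getenv("SUPPORTED_FORMATS", "mp3,m4a,wav,ogg,flac").split(",")
MAX_AUDIO_SIZE_MB = int(os.getenv("MAX_AUDIO_SIZE_MB", "50"))

def is_acceptable(filename: str, size_bytes: int) -> bool:
    """Illustrative check: extension must be whitelisted and size under the cap."""
    extension = filename.rsplit(".", 1)[-1].lower()
    within_size = size_bytes <= MAX_AUDIO_SIZE_MB * 1024 * 1024
    return extension in SUPPORTED_FORMATS and within_size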

🐳 Docker Deployment

Cloud Deployment (Recommended)

Perfect for cloud platforms like Render, Railway, etc. The model is included in the image.

version: '3.8'
services:
  whisper-bot:
    image: malithrukshan/whisper-transcriber-bot:latest
    container_name: whisper-transcriber-bot
    restart: unless-stopped
    environment:
      - TELEGRAM_BOT_TOKEN=${TELEGRAM_BOT_TOKEN}

Local Development with Volume Mount

For local development where you want to persist models between container rebuilds:

version: '3.8'
services:
  whisper-bot:
    image: malithrukshan/whisper-transcriber-bot:latest
    container_name: whisper-transcriber-bot
    restart: unless-stopped
    environment:
      - TELEGRAM_BOT_TOKEN=${TELEGRAM_BOT_TOKEN}
    volumes:
      - ./models:/app/models

Using Docker CLI

# Cloud deployment (model included in image)
docker run -d \
  --name whisper-bot \
  -e TELEGRAM_BOT_TOKEN=your_token_here \
  malithrukshan/whisper-transcriber-bot:latest

# Local development (with volume mount)
docker run -d \
  --name whisper-bot \
  -e TELEGRAM_BOT_TOKEN=your_token_here \
  -v $(pwd)/models:/app/models \
  malithrukshan/whisper-transcriber-bot:latest

πŸ› οΈ Development

Local Development Setup

# Clone and setup
git clone https://github.com/Malith-Rukshan/whisper-transcriber-bot.git
cd whisper-transcriber-bot

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt
pip install -r requirements-dev.txt  # Development dependencies

# Download model
./download_model.sh

# Configure environment
cp .env.example .env
# Add your bot token to .env

# Run bot
python src/bot.py

Project Structure

whisper-transcriber-bot/
β”œβ”€β”€ src/                    # Source code
β”‚   β”œβ”€β”€ bot.py             # Main bot application
β”‚   β”œβ”€β”€ transcriber.py     # Whisper integration
β”‚   β”œβ”€β”€ config.py          # Configuration management
β”‚   └── utils.py           # Utility functions
β”œβ”€β”€ tests/                 # Test files
β”‚   β”œβ”€β”€ test_bot.py        # Bot functionality tests
β”‚   └── test_utils.py      # Utility function tests
β”œβ”€β”€ .github/workflows/     # CI/CD automation
β”œβ”€β”€ models/                # AI model storage
β”œβ”€β”€ Dockerfile            # Container configuration
β”œβ”€β”€ docker-compose.yml    # Deployment setup
β”œβ”€β”€ requirements.txt      # Production dependencies
β”œβ”€β”€ requirements-dev.txt  # Development dependencies
└── README.md            # This file

πŸ§ͺ Testing

Running Tests

# Run all tests
python -m pytest

# Run with coverage
python -m pytest --cov=src

# Run specific test file
python -m pytest tests/test_bot.py

# Run with verbose output
python -m pytest -v

Test Coverage

The test suite covers:

  • βœ… Bot initialization and configuration
  • βœ… Command handlers (/start, /help, /about, /status)
  • βœ… Audio processing workflow
  • βœ… Utility functions
  • βœ… Error handling scenarios
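
To give a flavour of the suite, a unit test might look like the sketch below. The helper it exercises is the illustrative format/size check from the Usage section, not necessarily a function in src/utils.py.

# Illustrative only; the actual test modules and helpers differ.
import pytest

def is_acceptable(filename: str, size_bytes: int, max_mb: int = 50) -> bool:
    # Hypothetical helper mirroring the format/size check shown earlier.
    extension = filename.rsplit(".", 1)[-1].lower()
    return extension in {"mp3", "m4a", "wav", "ogg", "flac"} and size_bytes <= max_mb * 1024 * 1024

@pytest.mark.parametrize("filename,size,expected", [
    ("speech.mp3", 1_000_000, True),          # supported format, small file
    ("speech.aiff", 1_000_000, False),        # unsupported format
    ("speech.wav", 60 * 1024 * 1024, False),  # over the 50 MB cap
])
def test_is_acceptable(filename, size, expected):
    assert is_acceptable(filename, size) is expected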

Code Quality

# Format code
black src/ tests/

# Security check
bandit -r src/

πŸ”§ Configuration

Environment Variables

| Variable | Description | Default |
|----------|-------------|---------|
| TELEGRAM_BOT_TOKEN | Bot token from @BotFather | Required |
| WHISPER_MODEL_PATH | Path to Whisper model file | models/ggml-base.en.bin |
| WHISPER_MODEL_NAME | Model name for display | base.en |
| BOT_USERNAME | Bot username for branding | TranscriberXBOT |
| MAX_AUDIO_SIZE_MB | Maximum audio file size (MB) | 50 |
| SUPPORTED_FORMATS | Supported audio formats | mp3,m4a,wav,ogg,flac |
| LOG_LEVEL | Logging verbosity | INFO |
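
These are plain environment variables, so src/config.py presumably reads them with sensible fallbacks. A hedged sketch covering a few of them; the attribute and function names are illustrative.

import os
from dataclasses import dataclass

@dataclass(frozen=True)
class Settings:
    bot_token: str
    model_path: str
    max_audio_size_mb: int
    supported_formats: list[str]
    log_level: str

def load_settings() -> Settings:
    # TELEGRAM_BOT_TOKEN is required: the bot cannot start without it.
    return Settings(
        bot_token=os.environ["TELEGRAM_BOT_TOKEN"],
        model_path=os.getenv("WHISPER_MODEL_PATH", "models/ggml-base.en.bin"),
        max_audio_size_mb=int(os.getenv("MAX_AUDIO_SIZE_MB", "50")),
        supported_formats=os.getenv("SUPPORTED_FORMATS", "mp3,m4a,wav,ogg,flac").split(","),
        log_level=os.getenv("LOG_LEVEL", "INFO"),
    )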

Performance Tuning

# Adjust CPU threads for transcription
export WHISPER_THREADS=4

# Set memory limits
export WHISPER_MAX_MEMORY=512M

# Configure concurrent processing
export MAX_CONCURRENT_TRANSCRIPTIONS=5
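
These tuning variables would typically be read once at startup to size the whisper.cpp thread count and the concurrency gate. A sketch under the assumption that pywhispercpp and an asyncio semaphore are used; variable names mirror the exports above.

import asyncio
import os
from pywhispercpp.model import Model

# CPU threads whisper.cpp may use per transcription.
WHISPER_THREADS = int(os.getenv("WHISPER_THREADS", "4"))
# Upper bound on transcriptions running at the same time.
MAX_CONCURRENT = int(os.getenv("MAX_CONCURRENT_TRANSCRIPTIONS", "5"))

model = Model("models/ggml-base.en.bin", n_threads=WHISPER_THREADS)
transcription_slots = asyncio.Semaphore(MAX_CONCURRENT)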

πŸ“Š Performance Metrics

| Audio Length | Processing Time | Memory Usage |
|--------------|-----------------|--------------|
| 30 seconds | ~1.2 seconds | ~180 MB |
| 2 minutes | ~2.8 seconds | ~200 MB |
| 5 minutes | ~6.1 seconds | ~220 MB |

Scaling Recommendations

  • Single Instance: Handles 50+ concurrent users
  • Minimal Resources: 1 CPU core, 512MB RAM minimum (no GPU required!)
  • Storage: 1GB for model + temporary files
  • Cost-Effective: Perfect for budget VPS hosting ($5-10/month)
  • No External APIs: Zero ongoing API costs or dependencies
  • Load Balancing: Deploy multiple instances for higher traffic

🀝 Contributing

We welcome contributions! Please see our Contributing Guide for details.

Quick Contribution Steps

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

Development Guidelines

  • Follow PEP 8 style guide
  • Write tests for new features
  • Update documentation
  • Ensure Docker build succeeds
  • Run quality checks before PR

πŸ“ˆ Technical Architecture

Core Components

  • Framework: python-telegram-bot v22.2
  • AI Model: OpenAI Whisper base.en (147MB, CPU-optimized)
  • Bindings: pywhispercpp for C++ performance (no GPU needed)
  • Runtime: Python 3.11 with asyncio for concurrent processing
  • Container: Multi-architecture Docker (AMD64/ARM64)
  • Resource: CPU-only inference, minimal memory footprint
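
As a rough picture of how these pieces fit together in src/transcriber.py, here is a hedged sketch of CPU-only transcription through pywhispercpp; the actual module layout and function names may differ.

from pywhispercpp.model import Model

# Load the ggml base.en model once at startup; inference runs on the CPU only.
model = Model("models/ggml-base.en.bin", n_threads=4)

def transcribe(audio_path: str) -> str:
    # whisper.cpp returns the result as a list of timed segments.
    segments = model.transcribe(audio_path)
    return " ".join(segment.text.strip() for segment in segments)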

Performance Features

  • Async Processing: Non-blocking audio transcription
  • Concurrent Handling: Multiple users supported simultaneously
  • Memory Management: Efficient model loading and cleanup
  • Error Recovery: Robust error handling and logging
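
"Non-blocking" here most likely means pushing the CPU-bound whisper call off the event loop while a semaphore caps how many run at once. A sketch of that pattern with a placeholder for the blocking call; not the project's literal code.

import asyncio

transcription_slots = asyncio.Semaphore(5)  # cap on concurrent transcriptions

def transcribe_blocking(audio_path: str) -> str:
    # Placeholder for the CPU-bound whisper.cpp call.
    return f"transcript of {audio_path}"

async def transcribe_async(audio_path: str) -> str:
    # The semaphore bounds concurrency; the executor keeps the event loop responsive.
    async with transcription_slots:
        loop = asyncio.get_running_loop()
        return await loop.run_in_executor(None, transcribe_blocking, audio_path)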

πŸ“„ License

This project is licensed under the MIT License - see the LICENSE file for details.

πŸ™ Acknowledgments

  • OpenAI - For the incredible Whisper speech recognition model
  • pywhispercpp - High-performance Python bindings for whisper.cpp
  • python-telegram-bot - Excellent Telegram Bot API framework
  • whisper.cpp - Optimized C++ implementation of Whisper

πŸ‘¨β€πŸ’» Developer

Malith Rukshan


⭐ Star History

Star History Chart

If this project helped you, please consider giving it a ⭐!

Made with ❀️ by Malith Rukshan

πŸš€ Try the Bot β€’ ⭐ Star on GitHub β€’ 🐳 Docker Hub
