Case study

Telegram RAG Assistant

A Telegram polling bot that runs RAG over semantic conversation history (Cloudflare Vectorize + OpenAI embeddings), supports image analysis (vision) and image generation (Replicate), and is backed by Postgres/Redis with a Django admin/API.

Problem

  • A chat assistant needs long-term memory and retrieval without stuffing everything into the prompt.
  • Multimodal support (image analysis/generation) requires reliable storage and safe orchestration.
  • Operating the bot over time requires persistence, rate limiting, and instrumentation.

Solution

  • Implemented a polling Telegram bot (python-telegram-bot) that orchestrates the RAG + image workflows (polling sketch below).
  • Stored system-of-record data in Postgres (users, conversations, messages, analytics) and used Redis for sessions and rate limiting (rate-limit sketch below).
  • Generated embeddings via the OpenAI embeddings API and stored/queried vectors in Cloudflare Vectorize for semantic conversation retrieval (retrieval sketch below).
  • Used OpenRouter for chat and vision analysis and Replicate for image generation, returning results via Telegram (provider sketch below).
  • Stored images in Cloudflare R2 and served them back via presigned URLs (presigned-URL sketch below).
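
A minimal sketch of the polling loop, assuming python-telegram-bot v20+; the token placeholder and the commented-out helper calls (rate limiting, retrieval, chat) are illustrative stand-ins for the real orchestration:

```python
from telegram import Update
from telegram.ext import Application, ContextTypes, MessageHandler, filters

async def on_text(update: Update, context: ContextTypes.DEFAULT_TYPE) -> None:
    """One turn of the bot: check limits, retrieve context, answer, reply."""
    user_id = update.effective_user.id
    text = update.message.text
    # Hypothetical helpers standing in for the real orchestration:
    # if not allow_request(user_id): return       # rate limiting (see rate-limit sketch)
    # snippets = retrieve_context(text)           # semantic retrieval (see retrieval sketch)
    # answer = chat_with_context(text, snippets)  # chat via OpenRouter (see provider sketch)
    answer = f"Echo: {text}"  # placeholder so the sketch runs standalone
    await update.message.reply_text(answer)

def main() -> None:
    # Token placeholder; in practice read from the environment or settings.
    app = Application.builder().token("TELEGRAM_BOT_TOKEN").build()
    app.add_handler(MessageHandler(filters.TEXT & ~filters.COMMAND, on_text))
    app.run_polling()  # long-polls Telegram; no webhook or public endpoint required

if __name__ == "__main__":
    main()
```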
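
A minimal sketch of the Redis-backed rate limiting, using a fixed-window counter; the key format, limit, and window length are illustrative assumptions:

```python
import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

RATE_LIMIT = 20      # illustrative: max messages per user per window
WINDOW_SECONDS = 60  # illustrative window length

def allow_request(user_id: int) -> bool:
    """Fixed-window counter: one Redis key per user, reset when the window expires."""
    key = f"ratelimit:{user_id}"
    count = r.incr(key)  # atomic increment; creates the key at 1 if missing
    if count == 1:
        r.expire(key, WINDOW_SECONDS)  # start the window on the first request
    return count <= RATE_LIMIT
```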
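
A minimal sketch of the retrieval step, assuming the OpenAI Python SDK for embeddings and Cloudflare's Vectorize v2 REST query endpoint; the index name, model choice, and topK are illustrative, and the endpoint shape should be verified against current Cloudflare docs:

```python
import os
import requests
from openai import OpenAI

openai_client = OpenAI()  # reads OPENAI_API_KEY from the environment

CF_ACCOUNT_ID = os.environ["CF_ACCOUNT_ID"]
CF_API_TOKEN = os.environ["CF_API_TOKEN"]
VECTORIZE_INDEX = "conversation-history"  # illustrative index name

def retrieve_context(query: str, top_k: int = 5) -> list[dict]:
    """Embed the user query and pull the most similar past messages from Vectorize."""
    embedding = openai_client.embeddings.create(
        model="text-embedding-3-small",  # illustrative model choice
        input=query,
    ).data[0].embedding

    resp = requests.post(
        f"https://api.cloudflare.com/client/v4/accounts/{CF_ACCOUNT_ID}"
        f"/vectorize/v2/indexes/{VECTORIZE_INDEX}/query",
        headers={"Authorization": f"Bearer {CF_API_TOKEN}"},
        json={"vector": embedding, "topK": top_k, "returnMetadata": "all"},
        timeout=10,
    )
    resp.raise_for_status()
    # Matches carry an id, a similarity score, and the metadata stored at insert
    # time (e.g. the original message text), which gets stitched into the prompt.
    return resp.json()["result"]["matches"]
```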
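
A minimal sketch of the provider calls, assuming OpenRouter's OpenAI-compatible endpoint for chat/vision and the replicate client for image generation; the model identifiers are illustrative:

```python
import os
import replicate
from openai import OpenAI

# OpenRouter exposes an OpenAI-compatible API, so one client covers chat and vision.
openrouter = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key=os.environ["OPENROUTER_API_KEY"],
)

def describe_image(image_url: str) -> str:
    """Vision analysis: send the image URL to a multimodal model via OpenRouter."""
    resp = openrouter.chat.completions.create(
        model="openai/gpt-4o-mini",  # illustrative model slug
        messages=[{
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this image."},
                {"type": "image_url", "image_url": {"url": image_url}},
            ],
        }],
    )
    return resp.choices[0].message.content

def generate_image(prompt: str):
    """Image generation via Replicate; returns the model output (typically image URLs/files)."""
    return replicate.run(
        "black-forest-labs/flux-schnell",  # illustrative model; reads REPLICATE_API_TOKEN from env
        input={"prompt": prompt},
    )
```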
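
A minimal sketch of R2 storage with presigned URLs, using boto3 against R2's S3-compatible endpoint; the bucket name and credential variables are illustrative:

```python
import os
import boto3
from botocore.config import Config

# R2 is S3-compatible, so boto3 works against the account's R2 endpoint.
s3 = boto3.client(
    "s3",
    endpoint_url=f"https://{os.environ['CF_ACCOUNT_ID']}.r2.cloudflarestorage.com",
    aws_access_key_id=os.environ["R2_ACCESS_KEY_ID"],
    aws_secret_access_key=os.environ["R2_SECRET_ACCESS_KEY"],
    config=Config(signature_version="s3v4"),
)

BUCKET = "bot-images"  # illustrative bucket name

def upload_and_presign(key: str, data: bytes, expires_in: int = 3600) -> str:
    """Store an image in R2 and return a time-limited URL that Telegram can fetch."""
    s3.put_object(Bucket=BUCKET, Key=key, Body=data, ContentType="image/png")
    return s3.generate_presigned_url(
        "get_object",
        Params={"Bucket": BUCKET, "Key": key},
        ExpiresIn=expires_in,
    )
```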

Architecture

Telegram updates flow into the polling bot; Postgres holds users, conversations, messages, and analytics, while Redis handles sessions and rate limits. Retrieval runs through OpenAI embeddings and Cloudflare Vectorize, chat and vision requests go to OpenRouter, image generation goes to Replicate, and media is stored in Cloudflare R2, with a Django admin/API over the same Postgres data.

Results

  • <2s response time for grounded RAG answers with citations
  • Semantic memory: conversation history stored and retrieved via embeddings
  • Multi-provider AI: chat, vision, and image generation unified behind one bot
  • Rate-limited: Redis-backed limits prevent abuse and control costs

How this maps to client use cases

  • Support bots that recall prior conversations and stay grounded in internal context.
  • Ops assistants that handle screenshots/photos (vision) and provide structured answers.
  • Workflow bots that integrate with multiple AI providers behind a single safety-controlled layer.