Case study
Telegram RAG Assistant
A polling Telegram bot that runs RAG over semantic conversation history (Cloudflare Vectorize + embeddings), supports image analysis (vision) and image generation (Replicate), and is backed by Postgres/Redis with a Django admin/API.
Problem
- A chat assistant needs long-term memory and retrieval without stuffing everything into the prompt.
- Multimodal support (image analysis/generation) requires reliable storage and safe orchestration.
- Operating the bot over time requires persistence, rate limiting, and instrumentation.
Solution
- Implemented a polling Telegram bot (python-telegram-bot) that orchestrates the RAG and image workflows (minimal sketches of the main pieces follow this list).
- Stored system-of-record data in Postgres (users, conversations, messages, analytics) and used Redis for sessions/rate limiting.
- Generated embeddings via the OpenAI embeddings API and stored/queried vectors in Cloudflare Vectorize for semantic conversation retrieval.
- Used OpenRouter for chat and vision analysis; used Replicate for image generation and returned results via Telegram.
- Stored images in Cloudflare R2 and served them back via presigned URLs.
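A minimal sketch of the polling entry point, assuming python-telegram-bot v20+. `answer_with_rag` is a hypothetical placeholder for the retrieval pipeline, not the project's actual function name.

```python
import os

from telegram import Update
from telegram.ext import Application, ContextTypes, MessageHandler, filters

async def answer_with_rag(user_id: int, text: str) -> str:
    # Placeholder: the real pipeline embeds `text`, queries Vectorize for
    # related conversation history, and calls the chat model with that context.
    return f"(echo) {text}"

async def handle_text(update: Update, context: ContextTypes.DEFAULT_TYPE) -> None:
    # Route the user's message through the RAG pipeline and reply in the same chat.
    answer = await answer_with_rag(update.effective_user.id, update.message.text)
    await update.message.reply_text(answer)

def main() -> None:
    app = Application.builder().token(os.environ["TELEGRAM_BOT_TOKEN"]).build()
    # Plain text goes to the RAG handler; commands are registered separately.
    app.add_handler(MessageHandler(filters.TEXT & ~filters.COMMAND, handle_text))
    app.run_polling()  # long polling, so no public webhook endpoint is needed

if __name__ == "__main__":
    main()
```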
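One way to implement the Redis-backed limits is a fixed-window counter, shown here with the redis-py client; the key prefix, limit, and window are illustrative, not the bot's actual policy.

```python
import redis

r = redis.Redis(host="localhost", port=6379, db=0)

def allow_request(user_id: int, limit: int = 20, window_s: int = 60) -> bool:
    """Return True if the user is under `limit` requests in the current window."""
    key = f"ratelimit:{user_id}"
    count = r.incr(key)          # atomically increment the per-user counter
    if count == 1:
        r.expire(key, window_s)  # start the window on the first request
    return count <= limit
```

In a setup like this, handlers check the limiter before calling any paid provider, which is what keeps abuse and spend in check.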
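A sketch of the retrieval step, assuming the OpenAI Python SDK (v1+) for embeddings and Cloudflare Vectorize's REST query endpoint. The embedding model, index name, and topK value are placeholders, and the exact Vectorize request/response shape should be checked against the current API docs.

```python
import os

import requests
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

ACCOUNT_ID = os.environ["CF_ACCOUNT_ID"]
INDEX_NAME = "conversation-history"  # placeholder index name
VECTORIZE_URL = (
    f"https://api.cloudflare.com/client/v4/accounts/{ACCOUNT_ID}"
    f"/vectorize/v2/indexes/{INDEX_NAME}/query"
)

def retrieve_context(text: str, top_k: int = 5) -> list[dict]:
    # Embed the incoming message...
    emb = client.embeddings.create(model="text-embedding-3-small", input=text)
    vector = emb.data[0].embedding
    # ...then ask Vectorize for the nearest stored conversation chunks.
    resp = requests.post(
        VECTORIZE_URL,
        headers={"Authorization": f"Bearer {os.environ['CF_API_TOKEN']}"},
        json={"vector": vector, "topK": top_k},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["result"]["matches"]  # ids + scores of similar messages
```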
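OpenRouter exposes an OpenAI-compatible endpoint, so chat and vision calls can reuse the OpenAI SDK with a different base URL, while image generation goes through the Replicate client. The model slugs below are illustrative placeholders, not necessarily the ones the bot uses.

```python
import os

import replicate
from openai import OpenAI

# Chat / vision via OpenRouter's OpenAI-compatible API.
router = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key=os.environ["OPENROUTER_API_KEY"],
)

def chat(prompt: str, context: str) -> str:
    resp = router.chat.completions.create(
        model="anthropic/claude-3.5-sonnet",  # placeholder model slug
        messages=[
            {"role": "system", "content": f"Relevant history:\n{context}"},
            {"role": "user", "content": prompt},
        ],
    )
    return resp.choices[0].message.content

def generate_image(prompt: str):
    # Replicate runs the image model and returns the generated output.
    return replicate.run(
        "black-forest-labs/flux-schnell",  # placeholder model
        input={"prompt": prompt},
    )
```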
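R2 is S3-compatible, so presigned URLs can be generated with boto3 against the R2 endpoint; the bucket name, key, and expiry here are examples.

```python
import os

import boto3

s3 = boto3.client(
    "s3",
    endpoint_url=f"https://{os.environ['CF_ACCOUNT_ID']}.r2.cloudflarestorage.com",
    aws_access_key_id=os.environ["R2_ACCESS_KEY_ID"],
    aws_secret_access_key=os.environ["R2_SECRET_ACCESS_KEY"],
)

def image_url(key: str, expires_s: int = 3600) -> str:
    # Short-lived link that Telegram (or the user) can fetch directly.
    return s3.generate_presigned_url(
        "get_object",
        Params={"Bucket": "bot-images", "Key": key},  # example bucket name
        ExpiresIn=expires_s,
    )
```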
Architecture
Results
- <2s response time: grounded RAG answers with citations
- Semantic memory: conversation history stored and retrieved via embeddings
- Multi-provider AI: chat, vision, and image generation unified behind one bot
- Rate-limited: Redis-backed limits prevent abuse and control costs
How this maps to client use cases
- Support bots that recall prior conversations and stay grounded in internal context.
- Ops assistants that handle screenshots/photos (vision) and provide structured answers.
- Workflow bots that integrate with multiple AI providers behind a single safety-controlled layer.