Case study
Telegram RAG Assistant
A polling Telegram bot that runs RAG over semantic conversation history (Cloudflare Vectorize + embeddings), supports image analysis (vision) and image generation (Replicate), and is backed by Postgres/Redis with a Django admin/API.
Problem
- A chat assistant needs long-term memory and retrieval without stuffing everything into the prompt.
- Multimodal support (image analysis/generation) requires reliable storage and safe orchestration.
- Operating the bot over time requires persistence, rate limiting, and instrumentation.
Solution
- Implemented a polling Telegram bot (python-telegram-bot) that orchestrates the RAG and image workflows (minimal sketches of the main pieces follow this list).
- Stored system-of-record data in Postgres (users, conversations, messages, analytics) and used Redis for sessions/rate limiting.
- Generated embeddings via the OpenAI embeddings API and stored/queried vectors in Cloudflare Vectorize for semantic conversation retrieval.
- Used OpenRouter for chat and vision analysis; used Replicate for image generation and returned results via Telegram.
- Stored images in Cloudflare R2 and served them back via presigned URLs.
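A minimal sketch of the polling entry point, assuming python-telegram-bot v20+. `answer_with_rag` is a hypothetical placeholder for the retrieval pipeline, not the project's actual function name.

```python
import os

from telegram import Update
from telegram.ext import Application, ContextTypes, MessageHandler, filters

async def answer_with_rag(user_id: int, text: str) -> str:
    # Placeholder: the real pipeline embeds `text`, queries Vectorize for
    # related conversation history, and calls the chat model with that context.
    return f"(echo) {text}"

async def handle_text(update: Update, context: ContextTypes.DEFAULT_TYPE) -> None:
    # Route the user's message through the RAG pipeline and reply in the same chat.
    answer = await answer_with_rag(update.effective_user.id, update.message.text)
    await update.message.reply_text(answer)

def main() -> None:
    app = Application.builder().token(os.environ["TELEGRAM_BOT_TOKEN"]).build()
    # Plain text goes to the RAG handler; commands are registered separately.
    app.add_handler(MessageHandler(filters.TEXT & ~filters.COMMAND, handle_text))
    app.run_polling()  # long polling, so no public webhook endpoint is needed

if __name__ == "__main__":
    main()
```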
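One way to implement the Redis-backed limits is a fixed-window counter, shown here with the redis-py client; the key prefix, limit, and window are illustrative, not the bot's actual policy.

```python
import redis

r = redis.Redis(host="localhost", port=6379, db=0)

def allow_request(user_id: int, limit: int = 20, window_s: int = 60) -> bool:
    """Return True if the user is under `limit` requests in the current window."""
    key = f"ratelimit:{user_id}"
    count = r.incr(key)          # atomically increment the per-user counter
    if count == 1:
        r.expire(key, window_s)  # start the window on the first request
    return count <= limit
```

In a setup like this, handlers check the limiter before calling any paid provider, which is what keeps abuse and spend in check.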
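A sketch of the retrieval step, assuming the OpenAI Python SDK (v1+) for embeddings and Cloudflare Vectorize's REST query endpoint. The embedding model, index name, and topK value are placeholders, and the exact Vectorize request/response shape should be checked against the current API docs.

```python
import os

import requests
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

ACCOUNT_ID = os.environ["CF_ACCOUNT_ID"]
INDEX_NAME = "conversation-history"  # placeholder index name
VECTORIZE_URL = (
    f"https://api.cloudflare.com/client/v4/accounts/{ACCOUNT_ID}"
    f"/vectorize/v2/indexes/{INDEX_NAME}/query"
)

def retrieve_context(text: str, top_k: int = 5) -> list[dict]:
    # Embed the incoming message...
    emb = client.embeddings.create(model="text-embedding-3-small", input=text)
    vector = emb.data[0].embedding
    # ...then ask Vectorize for the nearest stored conversation chunks.
    resp = requests.post(
        VECTORIZE_URL,
        headers={"Authorization": f"Bearer {os.environ['CF_API_TOKEN']}"},
        json={"vector": vector, "topK": top_k},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["result"]["matches"]  # ids + scores of similar messages
```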
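OpenRouter exposes an OpenAI-compatible endpoint, so chat and vision calls can reuse the OpenAI SDK with a different base URL, while image generation goes through the Replicate client. The model slugs below are illustrative placeholders, not necessarily the ones the bot uses.

```python
import os

import replicate
from openai import OpenAI

# Chat / vision via OpenRouter's OpenAI-compatible API.
router = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key=os.environ["OPENROUTER_API_KEY"],
)

def chat(prompt: str, context: str) -> str:
    resp = router.chat.completions.create(
        model="anthropic/claude-3.5-sonnet",  # placeholder model slug
        messages=[
            {"role": "system", "content": f"Relevant history:\n{context}"},
            {"role": "user", "content": prompt},
        ],
    )
    return resp.choices[0].message.content

def generate_image(prompt: str):
    # Replicate runs the image model and returns the generated output.
    return replicate.run(
        "black-forest-labs/flux-schnell",  # placeholder model
        input={"prompt": prompt},
    )
```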
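R2 is S3-compatible, so presigned URLs can be generated with boto3 against the R2 endpoint; the bucket name, key, and expiry here are examples.

```python
import os

import boto3

s3 = boto3.client(
    "s3",
    endpoint_url=f"https://{os.environ['CF_ACCOUNT_ID']}.r2.cloudflarestorage.com",
    aws_access_key_id=os.environ["R2_ACCESS_KEY_ID"],
    aws_secret_access_key=os.environ["R2_SECRET_ACCESS_KEY"],
)

def image_url(key: str, expires_s: int = 3600) -> str:
    # Short-lived link that Telegram (or the user) can fetch directly.
    return s3.generate_presigned_url(
        "get_object",
        Params={"Bucket": "bot-images", "Key": key},  # example bucket name
        ExpiresIn=expires_s,
    )
```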
Architecture
Results
- <2s response time: grounded RAG answers with citations
- Semantic memory: conversation history stored and retrieved via embeddings
- Multi-provider AI: chat, vision, and image generation unified behind one bot
- Rate-limited: Redis-backed limits prevent abuse and control costs
How this maps to client use cases
- Support bots that recall prior conversations and stay grounded in internal context.
- Ops assistants that handle screenshots/photos (vision) and provide structured answers.
- Workflow bots that integrate with multiple AI providers behind a single safety-controlled layer.