
In this post, we will show you how to build an intelligent FAQ bot using Python and the Telegram Bot API. We’ll go beyond simple commands by integrating a Retrieval-Augmented Generation (RAG) pipeline with LangChain.

This RAG pipeline lets our bot pull information from a custom knowledge base (in our case, a simple faqs.json file) and use a local Large Language Model (LLM) through Ollama to generate accurate answers. The best part? This approach (which works great with interfaces like Open WebUI) gives you full control over your models and data with zero API costs.

What is Telegram?

You’ve probably heard of Telegram—it’s a popular, cloud-based instant messaging app. It’s fast, works everywhere (mobile, web, and desktop), and has powerful features like huge group chats and easy file sharing.

One of its most powerful features for developers is the Telegram Bot API, an open platform that allows anyone to build and integrate automated applications (like ours!) directly into the chat interface.

A Warning on Privacy and Encryption

Before we build our bot, it is critical to understand how Telegram handles encryption, as it directly impacts user privacy.

By default, Telegram conversations are "Cloud Chats": they are encrypted between your device and Telegram's servers, but they are not end-to-end encrypted. Only Secret Chats use end-to-end encryption, and bots cannot participate in Secret Chats. This means that any message a user sends to our bot is a Cloud Chat, is accessible to Telegram, and will be processed in plain text by our bot.py script on our server.

For this reason, you should never build a bot that asks for or encourages users to send sensitive private data such as passwords, financial information, or social security numbers. Always treat bot conversations as non-private.

What is Retrieval-Augmented Generation (RAG)?

Retrieval-Augmented Generation (RAG) is a technique that makes Large Language Models (LLMs) smarter by connecting them to external, private knowledge.

In short, instead of just asking the bot “What’s the shipping policy?”, we’re effectively asking, “Based on this specific text: ‘…We offer standard shipping…’ — what is the shipping policy?” This forces the LLM to base its answer on our facts, not its own general knowledge, making the response accurate and reliable.
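The core idea can be sketched in a few lines of plain Python: the retrieved text is prepended to the user's question before anything reaches the LLM. The prompt wording below is illustrative, not the exact template the bot will use:

```python
# A minimal sketch of prompt augmentation, the heart of RAG.
# retrieved_context would normally come from a vector-store search.
def build_rag_prompt(question: str, retrieved_context: str) -> str:
    return (
        "Answer the question using ONLY the context below.\n"
        f"Context: {retrieved_context}\n"
        f"Question: {question}"
    )

prompt = build_rag_prompt(
    "What is the shipping policy?",
    "We offer standard shipping on all orders.",
)
print(prompt)
```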

What you’ll build

An FAQ bot for Telegram that answers user questions by retrieving matching entries from a faqs.json knowledge base and passing them, together with the question, to a local LLM served through Ollama or Open WebUI.

Prerequisites

You’ll need a recent version of Python, the uv package manager, a Telegram account and a bot token from BotFather, and a local Open WebUI (or Ollama) instance serving an LLM.

Setting up the Project for Telegram Bot

uv is a high-performance Python package manager, so we’ll use it to set up our project. If you don’t have it installed, you can get it with pip, or visit the uv site for other installation options:

pip install uv

Create a new project directory and navigate into it:

mkdir telegram-rag-bot
cd telegram-rag-bot

Initialize a new Python project:

uv init --bare

This command creates a minimal pyproject.toml file. This file will track our project’s metadata and, most importantly, its dependencies.
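The generated file is tiny; the exact contents depend on your uv version, but it will look roughly like this (the name and requires-python values will reflect your setup):

```toml
[project]
name = "telegram-rag-bot"
version = "0.1.0"
requires-python = ">=3.12"
dependencies = []
```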

Create a virtual environment using uv:

uv venv

This will create a .venv directory. Activate it with the following:

source .venv/bin/activate

# On Windows, use:
# .venv\Scripts\activate

Install the necessary Python packages using uv:

uv add python-telegram-bot python-dotenv langchain langchain-openai langchain-community faiss-cpu jq sentence-transformers

The key libraries are python-telegram-bot (the async wrapper around the Telegram Bot API), python-dotenv (loads the .env file), langchain with langchain-openai and langchain-community (the RAG pipeline and its integrations), faiss-cpu (the vector store), jq (used by LangChain's JSONLoader to parse faqs.json), and sentence-transformers (generates the embeddings).

Environment and configuration

The bot reads the Telegram token from the environment variable BOT_TOKEN. We can store it in a .env file as BOT_TOKEN=your-token-here.

# .env (Open WebUI)
# OPEN_WEBUI_URL must end with /v1 (e.g., http://localhost:3000/v1).
BOT_TOKEN=123456:abcdefg
OPEN_WEBUI_URL=http://localhost:3000/v1
OPEN_WEBUI_API_KEY=your_key_here

Inline mode requires enabling inline for the bot via BotFather.

Create a new file named bot.py and add the following code to set up and add message handlers for the Telegram bot.

import logging
import os
from uuid import uuid4
from telegram import Update, InlineQueryResultArticle, InputTextMessageContent
from telegram.ext import filters, MessageHandler, ApplicationBuilder, CommandHandler, ContextTypes, InlineQueryHandler
from dotenv import load_dotenv

# load .env variables
load_dotenv()
bot_token = os.getenv("BOT_TOKEN", "")

# Setup logging
logging.basicConfig(
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s',
    level=logging.INFO
)
logger = logging.getLogger(__name__)

async def start(update: Update, context: ContextTypes.DEFAULT_TYPE):
    await context.bot.send_message(chat_id=update.effective_chat.id, text="I'm a bot, please talk to me!")

async def echo(update: Update, context: ContextTypes.DEFAULT_TYPE):
    await context.bot.send_message(chat_id=update.effective_chat.id, text=update.message.text)

async def caps(update: Update, context: ContextTypes.DEFAULT_TYPE):
    text_caps = ' '.join(context.args).upper()
    await context.bot.send_message(chat_id=update.effective_chat.id, text=text_caps)

async def inline_caps(update: Update, context: ContextTypes.DEFAULT_TYPE):
    query = update.inline_query.query
    if not query:
        return
    results = []
    results.append(
        InlineQueryResultArticle(
            id=str(uuid4()),
            title='Caps',
            input_message_content=InputTextMessageContent(query.upper())
        )
    )
    await context.bot.answer_inline_query(update.inline_query.id, results)

async def unknown(update: Update, context: ContextTypes.DEFAULT_TYPE):
    await context.bot.send_message(chat_id=update.effective_chat.id, text="Sorry, I didn't understand that command.")

async def document(update: Update, context: ContextTypes.DEFAULT_TYPE):
    if update.message.document:
        file = await update.message.document.get_file()
        file_name = update.message.document.file_name
    elif update.message.photo:
        # Get the largest photo size
        file = await update.message.photo[-1].get_file()
        file_name = f"photo_{file.file_unique_id}.jpg"  # Photos have no file name, so create one
    elif update.message.video:
        file = await update.message.video.get_file()
        file_name = update.message.video.file_name
    else:
        await update.message.reply_text("Please send a document, photo, or video.")
        return
    await file.download_to_drive(file_name)
    await update.message.reply_text(f"Saved {file_name}")

def main() -> None:
    start_handler = CommandHandler('start', start)
    echo_handler = MessageHandler(filters.TEXT & (~filters.COMMAND), echo)
    caps_handler = CommandHandler('caps', caps)
    inline_caps_handler = InlineQueryHandler(inline_caps)
    document_handler = MessageHandler(filters.PHOTO | filters.Document.PDF | filters.VIDEO, document)
    unknown_handler = MessageHandler(filters.COMMAND, unknown)

    application = ApplicationBuilder().token(bot_token).build()

    application.add_handler(start_handler)
    application.add_handler(echo_handler)
    application.add_handler(caps_handler)
    application.add_handler(inline_caps_handler)
    application.add_handler(document_handler)
    application.add_handler(unknown_handler)

    # Run the bot
    logger.info("Starting bot polling...")
    application.run_polling()

if __name__ == '__main__':
    main()

Try it out

To run the application, simply run:

uv run bot.py

Search for your bot name in Telegram and send the bot a message or command like /start or /caps.

What this bot does

Core structure

bot.py loads the token from .env, configures logging, and defines one async handler function per interaction type.

Handlers and filters

CommandHandler routes /start and /caps; a MessageHandler with filters.TEXT & (~filters.COMMAND) echoes plain text; InlineQueryHandler answers inline queries with an uppercased result; a media MessageHandler (filters.PHOTO | filters.Document.PDF | filters.VIDEO) downloads attachments; and a final filters.COMMAND handler catches unknown commands — it is registered last so real commands are matched first.

Running the bot

main() wires up the handlers, builds the Application with the token, logs a startup message, and starts long polling via application.run_polling().

This script is a clean, async-first Telegram bot scaffold that demonstrates commands, inline mode, message filtering, and media downloads—ready to extend for more sophisticated behaviors.

Now that we have our bot ready, we will extend the code to add a RAG pipeline to the bot.

Setting up the knowledge base

Let’s set up a knowledge base by creating a file named faqs.json to hold our data. The RAG pipeline will load and search this content. An example structure is shown below.

[
 {
    "category": "General",
    "question": "What are your operating hours?",
    "answer": "Monday to Friday, 9:00 AM–5:00 PM (local time)."
  },
  {
    "category": "Accounts",
    "question": "How do I reset my password?",
    "answer": "Go to our website, click Login, then Forgot Password. Check your email for the reset link."
  }
]
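Before handing this file to LangChain, a quick sanity check with plain Python can help; this sketch (the data is inlined here to keep it self-contained, but in the real bot it would come from json.load on faqs.json) flattens each entry into the "Q: … A: …" text that will later be embedded:

```python
import json

# Mirror of the example faqs.json content; in the real bot this
# would be loaded with json.load(open("faqs.json")) instead.
faqs = [
    {"category": "General",
     "question": "What are your operating hours?",
     "answer": "Monday to Friday, 9:00 AM-5:00 PM (local time)."},
    {"category": "Accounts",
     "question": "How do I reset my password?",
     "answer": "Go to our website, click Login, then Forgot Password."},
]

def to_chunks(entries):
    # One text chunk per FAQ; these strings are what get embedded.
    return [f"[{e['category']}] Q: {e['question']} A: {e['answer']}" for e in entries]

for chunk in to_chunks(faqs):
    print(chunk)
```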

Setting up the RAG Pipeline

The RAG pipeline is the engine that converts our static JSON file into a searchable brain for our bot. It initializes once and creates a vector database. In simple terms, there are two stages:

Data Ingestion (Indexing the FAQs), handled by the setup_rag_chain() method

This part happens once when the bot starts. We load the faqs.json file, create vector embeddings, and store them in a searchable database (FAISS).
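Conceptually, the index-once-then-search flow looks like the sketch below. To keep it runnable without FAISS or sentence-transformers, it scores by keyword overlap instead of embedding similarity; the real pipeline replaces this scoring with a vector search, but the shape — build the index at startup, query it per message — is the same:

```python
def build_index(faqs):
    # In the real pipeline this step computes embeddings and stores
    # them in FAISS; here the "index" is just tokenized FAQ text.
    return [(set((f["question"] + " " + f["answer"]).lower().split()), f) for f in faqs]

def retrieve(index, question, k=1):
    # Stand-in for vector similarity: count shared words.
    words = set(question.lower().split())
    scored = sorted(index, key=lambda item: len(item[0] & words), reverse=True)
    return [faq for _, faq in scored[:k]]

faqs = [
    {"question": "What are your operating hours?",
     "answer": "Monday to Friday, 9:00 AM-5:00 PM."},
    {"question": "How do I reset my password?",
     "answer": "Use the Forgot Password link on the login page."},
]

index = build_index(faqs)  # happens once, at startup
print(retrieve(index, "how do i reset my password")[0]["answer"])
```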

The RAG Retrieval Logic, handled by the handle_message() method

When a user asks a question, the retriever embeds it, finds the most similar FAQ entries in the FAISS index, and passes them, together with the question, to the LLM, which generates an answer grounded in that context.

New capabilities added to bot.py

The core logic of our bot will revolve around an update to the standard message handler. When a user sends a question, the bot no longer looks for a simple command; instead, it passes the question to the RAG pipeline.
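Put together, the updated handler boils down to the flow below. The llm callable is stubbed here (the real bot calls the local model through LangChain), and retrieval is reduced to a keyword-overlap stand-in picking the single best-matching FAQ, but the retrieve-augment-generate sequence is the one the bot follows:

```python
def answer_question(question, faqs, llm):
    # 1. Retrieve: pick the FAQ sharing the most words with the question
    #    (stand-in for the FAISS similarity search in the real pipeline).
    words = set(question.lower().split())
    best = max(faqs, key=lambda f: len(words & set(f["question"].lower().split())))
    # 2. Augment: build a grounded prompt from the retrieved entry.
    prompt = f"Context: {best['answer']}\nQuestion: {question}\nAnswer using only the context."
    # 3. Generate: hand the prompt to the LLM.
    return llm(prompt)

faqs = [
    {"question": "What are your operating hours?",
     "answer": "Monday to Friday, 9:00 AM-5:00 PM."},
    {"question": "How do I reset my password?",
     "answer": "Use the Forgot Password link."},
]

def echo_llm(prompt):
    # Stub LLM that just echoes the context line, for demonstration.
    return prompt.splitlines()[0]

print(answer_question("What are your operating hours?", faqs, echo_llm))
```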

Try it out

To run the application, simply run:

uv run bot.py

Search for your bot name in Telegram and send the bot a message like What are your operating hours?.

Changes to the RAG pipeline setup are available here. The source code is available here.

By integrating a RAG pipeline, we’ve leveled up our Telegram bot from a simple command processor to a knowledge-aware assistant. This approach ensures our bot’s answers are accurate, grounded in our provided faqs.json data, and remain consistent, dramatically reducing the chance of “hallucinations” from the underlying LLM.

This architecture is powerful and scalable. To expand its capabilities, we only need to update the faqs.json file and re-run the indexing step—no need to retrain or modify the core LLM!
