Case Study
Tiger Agent: Enterprise RAG Knowledge Assistant for Telecom

Bangladesh’s Leading Telecom Company
- Telecommunications / Enterprise AI
Bangladesh
Overview
Bangladesh's largest telecommunications operator manages a vast and constantly evolving library of internal documentation, from product policies and technical manuals to operational procedures; spread across formats and silos with no unified access layer. Support and operations teams had no efficient way to retrieve precise answers from this documentation, leading to slow resolution times and inconsistent responses.
Vivasoft designed and built Tiger Agent, a production-grade Retrieval-Augmented Generation (RAG) platform that transforms unstructured enterprise documents into an AI-powered knowledge assistant. The system ingests PDFs and diverse document formats, converts them into vector embeddings stored in a dedicated Weaviate vector database, and answers natural language queries by combining semantic understanding with keyword precision through hybrid search.
The result is an enterprise assistant capable of delivering context-grounded answers with 92–96% retrieval precision and an average response time of 6–10 seconds — with complete data sovereignty, no per-seat licensing costs, and a security posture built for a regulated telecommunications environment.
Technologies Used
Python
FastAPI
OpenAI
MongoDB
Redis
React
TypeScript
Ant Design
Tailwind CSS
Challenges
- Unstructured, Heterogeneous Documentation: Internal knowledge was spread across PDFs, reports, and documents in inconsistent formats with no standardized structure, making automated extraction and reliable chunking difficult.
- Retrieval Consistency and Precision: Pure semantic search produced inconsistent results on domain-specific telecom terminology, where exact keyword matches matter as much as conceptual relevance.
- LLM Calibration for Grounded Responses: Out-of-the-box LLM behavior required careful prompt engineering and temperature tuning to prevent hallucinations and ensure responses stayed grounded in source documentation.
- Query Classification and Fallback Handling: The system needed to correctly distinguish between queries it could answer from the knowledge base, queries answerable from conversation context, and queries that genuinely had no matching information.
- Data Sovereignty and Security: As a telecom operator, the client required all data to remain within a controlled infrastructure with no dependency on third-party hosted vector services, and robust access control throughout.
- Enterprise Operability: Beyond the chatbot itself, the client needed tools to manage documents, monitor usage, review conversation history, cache frequent responses, and track system performance, all without developer intervention.
Solutions
- Hybrid Search with Weaviate: Implemented a hybrid retrieval pipeline combining vector-based semantic search with keyword (BM25) scoring, tuned via configurable alpha weighting, to handle both conceptual queries and precise term lookups across telecom-specific content.
- Pattern-Based Document Processing Pipeline: Built an automated ingestion pipeline which initially dependent on LangChain but eventually shifted to a customized approach and also PyPDF2 that parses, cleans, and chunks documents into semantically coherent units, generates OpenAI embeddings, and loads them into Weaviate with deduplication logic to prevent redundant entries.
- Query Rewriting and Contextual Rephrasing: Before retrieval, user queries are rephrased using conversation history to resolve pronoun references and implicit context, improving retrieval accuracy on multi-turn conversations.
- Layered Fallback Architecture: A three-tier fallback system routes queries: first to the vector database, then to conversation history (using embedding-based relevance scoring), and finally to a defined “no result” response, preventing the model from generating unsupported answers.
- Advanced Prompt Engineering with LLM Judge: A self-evaluating prompt design embeds an LLM feedback layer (using a §-delimited format) that silently scores its own responses for quality, powering continuous improvement without user-visible overhead.
- Self-Hosted Infrastructure: Weaviate, MongoDB, and Redis are all properly containeraized and self hosted available in both Docker Container and an Independent Service system. It gives the client full control over data residency and eliminates external vector database licensing costs.
- Embeddable Chat Widget: A separate React-based chat widget (orange-agent-chat) built as a standalone embeddable component allows integration into existing internal portals via iframe without requiring any frontend rebuild.
- Admin Dashboard: A full React + TypeScript admin panel provides document management (Data Bank), live conversation monitoring, query log analytics, typeahead/suggestion management, response caching controls, and performance dashboards, enabling non-technical administrators to operate the platform independently.
Measurable Results
Team Involvement
| Resources | Count |
|---|---|
| Backend Developers | 3 |
| Frontend Developers | 1 |
Core Features of the Software
Hybrid RAG Query Engine
Combines Weaviate’s semantic vector search with BM25 keyword scoring using a tunable alpha parameter. This hybrid approach ensures highly accurate retrieval for both conceptual queries and telecom-specific terminology.
Automated Document Ingestion Pipeline
Processes PDFs and mixed-format documents through intelligent chunking and OpenAI-powered embeddings. Automatically syncs content to the Weaviate vector database with built-in deduplication and version control.
Context-Aware Query Rewriting
Enhances user queries by incorporating recent conversation context. Rewrites ambiguous or follow-up questions to improve retrieval accuracy without requiring users to restate queries.
Multi-Layer Fallback System
Implements a three-tier response logic: vector database → conversation history → predefined fallback. This layered approach minimizes hallucinations while maintaining consistent conversational flow.
Persistent Conversation Management
Stores threaded conversations in MongoDB with per-user session tracking. Automatically generates conversation titles and summarizes context to support seamless multi-turn interactions.
LLM Self-Evaluation (Judge Layer)
Integrates an internal evaluation mechanism that scores each response using structured prompts. Enables continuous quality monitoring and data-driven optimization over time.
Intelligent Response Caching
Utilizes Redis-based caching for frequently asked queries. Reduces redundant LLM calls and significantly improves response time under high usage.
Guided Query Suggestions
Provides admin-configurable typeahead suggestions within the chat interface. Helps users ask better questions and reduces irrelevant or out-of-scope queries.
Enterprise Admin Dashboard
A comprehensive React + Ant Design dashboard for system management, including document control (Data Bank), live chat monitoring, analytics via Recharts, cache management, and configuration settings.
Embeddable Chat Widget
A self-contained React chat component deployable as an iframe embed into any internal portal, enabling rollout across existing tools without frontend integration overhead.
Development Timeline
Project Start Time
May 2017
Project End Time
November 2022
Development Phases
Proof of Concept (POC):
2 months
Architecture & Design:
1 Month
Core Development
6 Months
Iteration & Optimization:
3 Months
Testing, QA & Deployment:
2 Months
Future Prospects
- Enhanced retrieval algorithms with re-ranking layers for further precision improvements
- PostgreSQL and enterprise platform connectors for broader data source integration
- Telecom-specific language model fine-tuning on domain vocabulary
- Interface optimization with advanced analytics and usage reporting
- Expanded document format support beyond PDFs










Ready to Build Your Own Enterprise AI Knowledge Layer?



