Case Study

Tiger Agent: Enterprise RAG Knowledge Assistant for Telecom

Tiger Agent, an Enterprise RAG chatbot interface showing AI-powered document retrieval, 98.4% success rate, 86.8% user satisfaction, and 42,500+ requests handled - built by Vivasoft
Client

Bangladesh’s Leading Telecom Company

Industry
Region

Bangladesh

Overview

Bangladesh's largest telecommunications operator manages a vast and constantly evolving library of internal documentation, from product policies and technical manuals to operational procedures, spread across formats and silos with no unified access layer. Support and operations teams had no efficient way to retrieve precise answers from this documentation, leading to slow resolution times and inconsistent responses.

Vivasoft designed and built Tiger Agent, a production-grade Retrieval-Augmented Generation (RAG) platform that transforms unstructured enterprise documents into an AI-powered knowledge assistant. The system ingests PDFs and diverse document formats, converts them into vector embeddings stored in a dedicated Weaviate vector database, and answers natural language queries by combining semantic understanding with keyword precision through hybrid search.

The result is an enterprise assistant capable of delivering context-grounded answers with 92–96% retrieval precision and an average response time of 6–10 seconds, with complete data sovereignty, no per-seat licensing costs, and a security posture built for a regulated telecommunications environment.

Technologies Used

Python

FastAPI

OpenAI

MongoDB

Redis

React

TypeScript

Ant Design

Tailwind CSS

Challenges

  1. Unstructured, Heterogeneous Documentation: Internal knowledge was spread across PDFs, reports, and documents in inconsistent formats with no standardized structure, making automated extraction and reliable chunking difficult.

  2. Retrieval Consistency and Precision: Pure semantic search produced inconsistent results on domain-specific telecom terminology, where exact keyword matches matter as much as conceptual relevance.

  3. LLM Calibration for Grounded Responses: Out-of-the-box LLM behavior required careful prompt engineering and temperature tuning to prevent hallucinations and ensure responses stayed grounded in source documentation.

  4. Query Classification and Fallback Handling: The system needed to correctly distinguish between queries it could answer from the knowledge base, queries answerable from conversation context, and queries that genuinely had no matching information.

  5. Data Sovereignty and Security: As a telecom operator, the client required all data to remain within a controlled infrastructure with no dependency on third-party hosted vector services, and robust access control throughout.

  6. Enterprise Operability: Beyond the chatbot itself, the client needed tools to manage documents, monitor usage, review conversation history, cache frequent responses, and track system performance, all without developer intervention.

Solutions

  1. Hybrid Search with Weaviate: Implemented a hybrid retrieval pipeline combining vector-based semantic search with keyword (BM25) scoring, tuned via configurable alpha weighting, to handle both conceptual queries and precise term lookups across telecom-specific content.

  2. Pattern-Based Document Processing Pipeline: Built an automated ingestion pipeline, initially dependent on LangChain and later moved to a custom implementation using PyPDF2, that parses, cleans, and chunks documents into semantically coherent units, generates OpenAI embeddings, and loads them into Weaviate with deduplication logic to prevent redundant entries.

  3. Query Rewriting and Contextual Rephrasing: Before retrieval, user queries are rephrased using conversation history to resolve pronoun references and implicit context, improving retrieval accuracy on multi-turn conversations.

  4. Layered Fallback Architecture: A three-tier fallback system routes queries: first to the vector database, then to conversation history (using embedding-based relevance scoring), and finally to a defined “no result” response, preventing the model from generating unsupported answers.

  5. Advanced Prompt Engineering with LLM Judge: A self-evaluating prompt design embeds an LLM feedback layer (using a §-delimited format) that silently scores its own responses for quality, powering continuous improvement without user-visible overhead.

  6. Self-Hosted Infrastructure: Weaviate, MongoDB, and Redis are all containerized and self-hosted, deployable either as Docker containers or as independent services. This gives the client full control over data residency and eliminates external vector database licensing costs.

  7. Embeddable Chat Widget: A separate React-based chat widget (orange-agent-chat) built as a standalone embeddable component allows integration into existing internal portals via iframe without requiring any frontend rebuild.

  8. Admin Dashboard: A full React + TypeScript admin panel provides document management (Data Bank), live conversation monitoring, query log analytics, typeahead/suggestion management, response caching controls, and performance dashboards, enabling non-technical administrators to operate the platform independently.

Measurable Results

92–96% retrieval precision across production queries
4–8 seconds average end-to-end response time (without caching)
100% data sovereignty: all embeddings, documents, and conversation logs self-hosted with zero third-party cloud vector dependency
Eliminated per-seat licensing costs compared to commercial knowledge management alternatives
Multi-format document support: PDFs and diverse file types processed through a single automated ingestion pipeline
Zero-touch document management: the operations team can upload, update, and retire documents via the admin dashboard without engineering involvement

Team Involvement

Resources              Count
Backend Developers     3
Frontend Developers    1

Core Features of the Software

Hybrid RAG Query Engine

Combines Weaviate’s semantic vector search with BM25 keyword scoring using a tunable alpha parameter. This hybrid approach ensures highly accurate retrieval for both conceptual queries and telecom-specific terminology.
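The alpha weighting described above can be illustrated with a small score-fusion sketch. This is not the project's code and it does not call Weaviate; it only assumes that each ranker returns a score per document and that the two score sets are normalized before blending. The function names and the min-max normalization are illustrative assumptions. The alpha convention follows Weaviate's hybrid search, where 1 means pure vector search and 0 means pure keyword search.

```python
from typing import Dict, List, Tuple


def min_max_normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Scale raw scores into [0, 1] so the two rankers are comparable."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All candidates scored the same; treat them as equally strong.
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}


def hybrid_rank(
    vector_scores: Dict[str, float],
    bm25_scores: Dict[str, float],
    alpha: float = 0.5,
) -> List[Tuple[str, float]]:
    """Blend both score sets: alpha=1 is pure vector, alpha=0 is pure keyword."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    vec = min_max_normalize(vector_scores)
    kw = min_max_normalize(bm25_scores)
    fused = {
        # A document missing from one ranker contributes 0 for that ranker.
        doc_id: alpha * vec.get(doc_id, 0.0) + (1.0 - alpha) * kw.get(doc_id, 0.0)
        for doc_id in set(vec) | set(kw)
    }
    # Sort by fused score descending, then by id so ties are deterministic.
    return sorted(fused.items(), key=lambda item: (-item[1], item[0]))
```

Lowering alpha favors exact term matches such as product codes or plan names, while raising it favors conceptual similarity. The right value for a given corpus is an empirical choice.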

Automated Document Ingestion Pipeline

Processes PDFs and mixed-format documents through intelligent chunking and OpenAI-powered embeddings. Automatically syncs content to the Weaviate vector database with built-in deduplication and version control.
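A minimal sketch of the chunk-and-deduplicate step is shown below, under stated assumptions. It uses fixed word windows with overlap as a simple stand-in for the pattern-based, semantically aware chunking the platform uses, and a plain dictionary as a stand-in for the vector store. The function names, window sizes, and the SHA-256 fingerprint are all hypothetical choices for illustration. PDF parsing and embedding generation are left out.

```python
import hashlib
from typing import Dict, List


def chunk_text(text: str, max_words: int = 120, overlap: int = 20) -> List[str]:
    """Split text into word windows that overlap so context survives the cut."""
    if max_words <= 0 or not 0 <= overlap < max_words:
        raise ValueError("need max_words > 0 and 0 <= overlap < max_words")
    words = text.split()
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # this window already reached the end of the text
    return chunks


def content_key(chunk: str) -> str:
    """Stable fingerprint: same text, ignoring case and spacing, gives the same key."""
    normalized = " ".join(chunk.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def add_new_chunks(store: Dict[str, str], chunks: List[str]) -> int:
    """Insert only unseen chunks into the store and return how many were added."""
    added = 0
    for chunk in chunks:
        key = content_key(chunk)
        if key not in store:
            store[key] = chunk
            added += 1
    return added
```

Keying on a content fingerprint means that re-uploading an unchanged document adds nothing, which is one straightforward way to avoid redundant entries.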

Context-Aware Query Rewriting

Enhances user queries by incorporating recent conversation context. Rewrites ambiguous or follow-up questions to improve retrieval accuracy without requiring users to restate queries.
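The rewriting step might be sketched as follows. The prompt wording, the number of turns kept, and the `complete` callable are assumptions made for illustration. `complete` stands in for whatever LLM client the system uses, so the sketch runs without any external service.

```python
from typing import Callable, List, Tuple

Turn = Tuple[str, str]  # (role, text), where role is "user" or "assistant"


def build_rewrite_prompt(history: List[Turn], question: str, max_turns: int = 4) -> str:
    """Assemble a prompt asking the model for a standalone version of the question."""
    recent = history[-max_turns:] if max_turns > 0 else []
    lines = [f"{role}: {text}" for role, text in recent]
    return (
        "Rewrite the final question so it can be understood without the "
        "conversation. Resolve pronouns and implied references. "
        "Return only the rewritten question.\n\n"
        "Conversation:\n" + "\n".join(lines) + "\n\n"
        f"Final question: {question}"
    )


def rewrite_query(
    history: List[Turn],
    question: str,
    complete: Callable[[str], str],
) -> str:
    """Return a standalone query, falling back to the original question."""
    if not history:
        return question  # nothing to resolve against, skip the model call
    rewritten = complete(build_rewrite_prompt(history, question)).strip()
    return rewritten or question  # guard against an empty model reply
```

With a history about a roaming pack, a follow-up such as "How do I activate it?" would ideally come back as "How do I activate the roaming pack?", which is the text then sent to retrieval.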

Multi-Layer Fallback System

Implements a three-tier response logic: vector database → conversation history → predefined fallback. This layered approach minimizes hallucinations while maintaining consistent conversational flow.
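The three-tier routing can be sketched roughly as below. The thresholds, the fallback message, and the data shapes are hypothetical. The sketch assumes knowledge-base hits arrive with a relevance score where higher is better, and that past conversation turns are stored alongside their embeddings.

```python
import math
from typing import List, Sequence, Tuple

NO_RESULT_MESSAGE = "Sorry, I could not find that in the available documents."


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 when either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def route_query(
    query_vec: Sequence[float],
    kb_hits: List[Tuple[str, float]],
    history: List[Tuple[str, Sequence[float]]],
    kb_threshold: float = 0.6,       # assumed cut-off, tune per corpus
    history_threshold: float = 0.75,  # assumed cut-off, tune per corpus
) -> Tuple[str, str]:
    """Return (tier, context) where tier names the layer that answered."""
    # Tier 1: use the knowledge base when its best hit is strong enough.
    if kb_hits:
        text, score = max(kb_hits, key=lambda hit: hit[1])
        if score >= kb_threshold:
            return "knowledge_base", text
    # Tier 2: otherwise look for a sufficiently similar earlier turn.
    if history:
        text, vec = max(history, key=lambda turn: cosine(query_vec, turn[1]))
        if cosine(query_vec, vec) >= history_threshold:
            return "history", text
    # Tier 3: a fixed reply, so the model is never asked to guess.
    return "no_result", NO_RESULT_MESSAGE
```

Returning the tier name along with the context makes it easy to log which layer answered each query.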

Persistent Conversation Management

Stores threaded conversations in MongoDB with per-user session tracking. Automatically generates conversation titles and summarizes context to support seamless multi-turn interactions.

LLM Self-Evaluation (Judge Layer)

Integrates an internal evaluation mechanism that scores each response using structured prompts. Enables continuous quality monitoring and data-driven optimization over time.
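One way such a judge layer can stay invisible to users is to have the model append its self-assessment inside delimiters and strip that block before display. The sketch below assumes a layout like `§score=4; grounded=yes§` at the end of the reply. Only the use of a § delimiter comes from the case study; the key names, the `key=value` pairs, and the function name are hypothetical.

```python
import re
from typing import Dict, Tuple

# Matches the shortest run of text enclosed by a pair of § characters.
FEEDBACK_BLOCK = re.compile(r"§(.*?)§", re.DOTALL)


def split_feedback(raw: str) -> Tuple[str, Dict[str, str]]:
    """Separate the user-visible answer from the hidden self-evaluation."""
    feedback: Dict[str, str] = {}
    for block in FEEDBACK_BLOCK.findall(raw):
        for pair in block.split(";"):
            key, sep, value = pair.partition("=")
            if sep and key.strip():
                feedback[key.strip()] = value.strip()
    # Remove every delimited block, then trim leftover whitespace.
    visible = FEEDBACK_BLOCK.sub("", raw).strip()
    return visible, feedback
```

The visible text goes to the chat interface, while the parsed feedback can be stored with the conversation log for later quality analysis.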

Intelligent Response Caching

Utilizes Redis-based caching for frequently asked queries. Reduces redundant LLM calls and significantly improves response time under high usage.
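The caching idea can be sketched without a running Redis instance. Below, an in-memory dictionary with expiry times stands in for Redis, and `generate` stands in for the full retrieval and LLM pipeline. The key prefix, the one-hour TTL, and the class and function names are assumptions for illustration, not the platform's configuration.

```python
import hashlib
import time
from typing import Callable, Dict, Optional, Tuple


def cache_key(query: str) -> str:
    """Normalize case and spacing so trivially different queries share a key."""
    normalized = " ".join(query.lower().split())
    return "answer:" + hashlib.sha256(normalized.encode("utf-8")).hexdigest()


class AnswerCache:
    """In-memory stand-in for a key-value cache with per-entry expiry."""

    def __init__(self, ttl_seconds: float = 3600.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, str]] = {}

    def get(self, query: str) -> Optional[str]:
        key = cache_key(query)
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, answer = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # expired, drop it and report a miss
            return None
        return answer

    def put(self, query: str, answer: str) -> None:
        self._entries[cache_key(query)] = (self._clock() + self._ttl, answer)


def answer_with_cache(query: str, cache: AnswerCache,
                      generate: Callable[[str], str]) -> str:
    """Serve from cache when possible; otherwise generate and store."""
    cached = cache.get(query)
    if cached is not None:
        return cached
    answer = generate(query)
    cache.put(query, answer)
    return answer
```

An expiry time keeps cached answers from outliving the documents they were drawn from. Clearing entries when a source document changes is another option.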

Guided Query Suggestions

Provides admin-configurable typeahead suggestions within the chat interface. Helps users ask better questions and reduces irrelevant or out-of-scope queries.
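A simple way to serve typeahead from an admin-managed list is a sorted prefix lookup, sketched below. The class name, the case-insensitive matching, and the default limit are illustrative assumptions. The case study does not describe how suggestions are matched.

```python
import bisect
from typing import Iterable, List


class SuggestionIndex:
    """Case-insensitive prefix lookup over a fixed list of suggestions."""

    def __init__(self, suggestions: Iterable[str]) -> None:
        unique = {s.strip() for s in suggestions if s.strip()}
        # Sort by lowercase form so prefix matches sit next to each other.
        self._items = sorted(unique, key=str.lower)
        self._keys = [s.lower() for s in self._items]

    def suggest(self, typed: str, limit: int = 5) -> List[str]:
        prefix = typed.strip().lower()
        if not prefix or limit <= 0:
            return []
        # Binary search finds the first key that is >= the prefix.
        start = bisect.bisect_left(self._keys, prefix)
        matches: List[str] = []
        for i in range(start, len(self._keys)):
            if not self._keys[i].startswith(prefix):
                break  # keys are sorted, so no later key can match
            matches.append(self._items[i])
            if len(matches) == limit:
                break
        return matches
```

Because administrators curate the list, every suggestion can be a question the knowledge base is known to answer.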

Enterprise Admin Dashboard

A comprehensive React + Ant Design dashboard for system management, including document control (Data Bank), live chat monitoring, analytics via Recharts, cache management, and configuration settings.

Embeddable Chat Widget

A self-contained React chat component deployable as an iframe embed into any internal portal, enabling rollout across existing tools without frontend integration overhead.

Development Timeline

Project Start Time: May 2017

Project End Time: November 2022

Development Phases

  1. Proof of Concept (POC): 2 Months
  2. Architecture & Design: 1 Month
  3. Core Development: 6 Months
  4. Iteration & Optimization: 3 Months
  5. Testing, QA & Deployment: 2 Months

Future Prospects

  • Enhanced retrieval algorithms with re-ranking layers for further precision improvements
  • PostgreSQL and enterprise platform connectors for broader data source integration
  • Telecom-specific language model fine-tuning on domain vocabulary
  • Interface optimization with advanced analytics and usage reporting
  • Expanded document format support beyond PDFs

Ready to Build Your Own Enterprise AI Knowledge Layer?

Tiger Agent demonstrates how purpose-built RAG infrastructure, designed around your data, your security requirements, and your team's operational workflow, outperforms off-the-shelf AI tools for enterprise knowledge management. Vivasoft's AI Lab specializes in building production-grade AI systems that deliver measurable accuracy with full data control.