RAG Ops Knowledge Base

A Retrieval-Augmented Generation (RAG) system for DevOps knowledge management. Ingest runbooks, incident playbooks, and documentation, then query them using natural language via a FastAPI endpoint.

Architecture

┌─────────────┐
│  Documents  │ (Markdown, PDF, Text)
└──────┬──────┘
       │
       ▼
┌─────────────┐
│   Chunker   │ (Fixed-size, Semantic)
└──────┬──────┘
       │
       ▼
┌─────────────┐
│  Bedrock    │ (Titan Embeddings)
│ Embeddings  │
└──────┬──────┘
       │
       ▼
┌──────────────────┐
│ OpenSearch       │ (Vector Store)
│ Serverless       │
└──────┬───────────┘
       │
       ▼
┌─────────────┐
│    Query    │ (Natural Language)
└──────┬──────┘
       │
       ▼
┌─────────────┐
│   Bedrock   │ (Claude LLM)
│     LLM     │
└──────┬──────┘
       │
       ▼
┌─────────────┐
│  Response   │ (Contextual Answer)
└─────────────┘
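
The fixed-size chunking stage in the diagram can be illustrated with a minimal sketch. The function name `chunk_fixed_size`, the character-based sizing, and the default `chunk_size` and `overlap` values are assumptions for illustration, not the project's actual chunker:

```python
def chunk_fixed_size(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into chunks of at most chunk_size characters.

    Consecutive chunks share `overlap` characters so that a sentence cut at a
    boundary still appears whole in at least one chunk.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current chunk already reaches the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks
```

Overlap trades a little extra embedding cost for better recall at chunk boundaries; a semantic chunker would split on headings or paragraphs instead.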

Features

Prerequisites

Quick Start

Local Development

  1. Clone the repository:
    git clone https://github.com/hammadhaqqani/rag-ops-knowledge-base.git
    cd rag-ops-knowledge-base
    
  2. Install dependencies:
    pip install -r requirements.txt
    
  3. Set environment variables:
    export AWS_REGION=us-east-1
    export OPENSEARCH_COLLECTION_ENDPOINT=https://your-collection.us-east-1.aoss.amazonaws.com
    export BEDROCK_MODEL_ID=anthropic.claude-v2
    
  4. Run the FastAPI application:
    uvicorn app.main:app --reload --port 8000
    
  5. Ingest sample runbooks:
    python scripts/ingest.py --directory data/sample-runbooks
    
  6. Query the knowledge base:
    python scripts/query.py "How do I recover an EC2 instance?"
    

AWS Deployment

  1. Deploy infrastructure:
    cd terraform
    terraform init
    terraform plan
    terraform apply
    
  2. Get outputs:
    terraform output opensearch_endpoint
    terraform output iam_role_arn
    
  3. Configure environment:
    export OPENSEARCH_COLLECTION_ENDPOINT=$(terraform output -raw opensearch_endpoint)
    export AWS_REGION=$(terraform output -raw aws_region)
    
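  4. Build the image, tag it for your registry, and push it (the image can then run on ECS, Lambda, or EC2):
    docker build -t rag-ops-kb .
    docker tag rag-ops-kb:latest <your-ecr-repo>/rag-ops-kb:latest
    docker push <your-ecr-repo>/rag-ops-kb:latest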
    

API Documentation

Endpoints

POST /query

Query the knowledge base with natural language.

Request Body:

{
  "query": "How do I troubleshoot high CPU usage?",
  "max_results": 5,
  "min_score": 0.7
}

Response:

{
  "answer": "To troubleshoot high CPU usage, first identify the process...",
  "sources": [
    {
      "document": "high-cpu-troubleshooting.md",
      "chunk_id": "chunk_123",
      "score": 0.89,
      "content": "Check top processes using 'top' or 'htop'..."
    }
  ],
  "query_time_ms": 245
}
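
A client for this endpoint might look like the following sketch, which uses only the Python standard library. The helper names `build_query_request` and `ask` are hypothetical, and the base URL assumes the local server from the Quick Start on port 8000:

```python
import json
import urllib.request


def build_query_request(query, max_results=5, min_score=0.7,
                        base_url="http://localhost:8000"):
    """Build a POST /query request carrying the JSON body shown above."""
    payload = {"query": query, "max_results": max_results, "min_score": min_score}
    return urllib.request.Request(
        f"{base_url}/query",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(query, **kwargs):
    """Send the query and return the decoded JSON response as a dict."""
    request = build_query_request(query, **kwargs)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, `ask("How do I troubleshoot high CPU usage?")["answer"]` would return the generated answer, and the `sources` list carries the document name and score for each supporting chunk.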

POST /ingest

Ingest a document into the knowledge base.

Request Body:

{
  "document_path": "/path/to/runbook.md",
  "metadata": {
    "category": "incident-response",
    "author": "ops-team"
  }
}

Response:

{
  "status": "success",
  "chunks_created": 15,
  "document_id": "doc_abc123"
}
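
To ingest a whole directory through this endpoint, one option is to build one request body per file. The sketch below only prepares the payloads; `build_ingest_payloads` is a hypothetical helper, and the suffix list is an assumption based on the Markdown, PDF, and text inputs named in the architecture diagram:

```python
from pathlib import Path


def build_ingest_payloads(directory, category, author,
                          suffixes=(".md", ".pdf", ".txt")):
    """Return one POST /ingest body per supported file under `directory`."""
    payloads = []
    # sorted() keeps the order stable across runs and platforms.
    for path in sorted(Path(directory).rglob("*")):
        if path.is_file() and path.suffix.lower() in suffixes:
            payloads.append({
                "document_path": str(path),
                "metadata": {"category": category, "author": author},
            })
    return payloads
```

Each payload can then be sent as JSON to `POST /ingest`. Note that `document_path` is resolved by the server, so the paths must be valid on the machine running the API.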

GET /health

Health check endpoint.

Response:

{
  "status": "healthy",
  "opensearch_connected": true,
  "bedrock_available": true
}
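
A readiness probe or deployment script could interpret this response as in the sketch below. The function names are hypothetical; the field names come from the response above:

```python
def failing_dependencies(health: dict) -> list[str]:
    """Return the dependency flags in a /health response that are not true."""
    checks = ("opensearch_connected", "bedrock_available")
    return [name for name in checks if health.get(name) is not True]


def is_ready(health: dict) -> bool:
    """Treat the service as ready only if it is healthy and no dependency fails."""
    return health.get("status") == "healthy" and not failing_dependencies(health)
```

Checking the individual flags, not just `status`, makes it easier to report which backend is unavailable.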

Configuration

The application uses environment variables for configuration. The Quick Start sets AWS_REGION, OPENSEARCH_COLLECTION_ENDPOINT, and BEDROCK_MODEL_ID.
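
As an illustration, the variables exported in the Quick Start could be loaded and validated at startup roughly like this. The `Settings` class and `load_settings` function are hypothetical, and the application may read additional variables not covered here:

```python
import os
from dataclasses import dataclass

# Variable names taken from the Quick Start section.
REQUIRED = ("AWS_REGION", "OPENSEARCH_COLLECTION_ENDPOINT", "BEDROCK_MODEL_ID")


@dataclass(frozen=True)
class Settings:
    aws_region: str
    opensearch_collection_endpoint: str
    bedrock_model_id: str


def load_settings(env=None) -> Settings:
    """Read the required variables, failing fast with a list of any that are missing."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return Settings(
        aws_region=env["AWS_REGION"],
        opensearch_collection_endpoint=env["OPENSEARCH_COLLECTION_ENDPOINT"],
        bedrock_model_id=env["BEDROCK_MODEL_ID"],
    )
```

Failing at startup with the full list of missing names is usually easier to debug than a connection error on the first request.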

Sample Queries

Contributing

Contributions are welcome! Please follow these steps:

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add some amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

Support

If you find this useful, consider buying me a coffee!