July 20, 2025

GithubBot

GithubBot is an AI framework for asking questions about a codebase in natural language. It indexes an entire GitHub repository, including its code and documentation, and exposes a chatbot that answers questions about it, from “What does this function do?” to “How do I implement a new feature?”.

🚀 Core Features

🤖 Intelligent Code Q&A: Provides precise, context-aware code explanations and suggestions based on Retrieval-Augmented Generation (RAG).
⚡️ Fully Automated Processing: Simply provide a GitHub repository URL to automatically clone, parse, chunk, vectorize, and index the code.
🔌 Highly Extensible: Easily swap or extend LLMs, embedding models, and vector databases. Supports model providers such as OpenAI, Azure, Cohere, and HuggingFace.
🔍 Hybrid Search: Combines vector search with BM25 keyword search so that both semantic questions and exact-identifier lookups retrieve relevant context.
⚙️ Asynchronous Task Handling: Uses Celery and Redis to manage time-consuming repository indexing tasks, ensuring API responsiveness and stability.
🐳 One-Click Deployment: Comes with a complete Docker Compose setup, allowing you to launch all services (API, Worker, databases, etc.) with a single command.
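
To make the Hybrid Search feature above more concrete, here is a minimal sketch of one common way to merge a vector-search ranking with a BM25 ranking: reciprocal rank fusion (RRF). This is an illustration only. The README does not say which fusion method GithubBot uses, and the function name, the constant `k=60`, and the file names are assumptions made for the example.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into a single ranking.

    Each document scores 1 / (k + rank) per list it appears in, so items
    ranked highly by both retrievers rise to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from the two retrievers for the same query.
vector_hits = ["auth.py", "session.py", "utils.py"]
bm25_hits = ["utils.py", "auth.py", "README.md"]

fused = reciprocal_rank_fusion([vector_hits, bm25_hits])
print(fused)
```

Rank-based fusion avoids having to normalize BM25 scores and vector similarities onto a common scale, which is why it is a frequent default for hybrid retrieval.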

🛠️ Tech Stack

Backend: FastAPI, Python 3.10+
AI / RAG: LangChain, OpenAI, Cohere, HuggingFace (extendable)
Database: PostgreSQL (metadata), ChromaDB (vector storage)
Task Queue: Celery, Redis
Containerization: Docker, Docker Compose
Data Validation: Pydantic

🚀 Quick Start

You can get GithubBot up and running in minutes with Docker.

1. Prerequisites

Docker: Install Docker
Docker Compose: Usually included with Docker Desktop.
Git: To clone this project.

2. Clone the Project

git clone https://github.com/oGYCo/GithubBot.git
cd GithubBot

3. Configure Environment

The project uses a .env file to manage secrets and configuration. Create your own .env file by copying the included .env.example:

cp .env.example .env

Then edit the .env file and add at least one model API key, for example your OpenAI API key:

# .env

# --- LLM and Embedding Model API Keys ---
# At least one model key is required
OPENAI_API_KEY="sk-..."
# AZURE_OPENAI_API_KEY=
# ANTHROPIC_API_KEY=
# ... other API keys
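
Since at least one model key is required, a quick check before launching can catch an empty .env early. The following is a minimal sketch, not part of GithubBot: it parses simple `KEY=value` lines and reports which of the key names shown above are set. The helper names are hypothetical, and the parser deliberately ignores advanced .env syntax such as multi-line values.

```python
# Key names taken from the .env example above; extend for other providers.
MODEL_KEY_NAMES = ("OPENAI_API_KEY", "AZURE_OPENAI_API_KEY", "ANTHROPIC_API_KEY")


def parse_env(text):
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[name.strip()] = value.strip().strip('"').strip("'")
    return values


def configured_model_keys(env):
    """Return the known model key names that have a non-empty value."""
    return [name for name in MODEL_KEY_NAMES if env.get(name)]


# Usage (run from the project root):
#   from pathlib import Path
#   keys = configured_model_keys(parse_env(Path(".env").read_text()))
#   if not keys:
#       raise SystemExit("No model API key found in .env")
```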

4. Launch Services

Build and start all services with a single command using Docker Compose:

docker-compose up --build -d

This command will start the API service, Celery worker, PostgreSQL, Redis, and ChromaDB.

5. Check Status

Wait a moment for the services to initialize, then check if all containers are running correctly:

docker-compose ps

You should see the status of all services as running or healthy.

📖 API Usage Example

Once the services are running, the API will be available at http://localhost:8000. You can access the interactive API documentation (Swagger UI) at http://localhost:8000/docs.

1. Index a New Repository

Send a POST request to the following endpoint to start analyzing a repository. This is an asynchronous operation: the API returns immediately with a session_id that identifies the indexing task.

URL: /api/v1/repositories/
Method: POST
Body:

{
  "repo_url": "https://github.com/tiangolo/fastapi"
}

Example (using cURL):

curl -X 'POST' \
  'http://localhost:8000/api/v1/repositories/' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
  "repo_url": "https://github.com/tiangolo/fastapi"
}'
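
The same request can be sent from Python using only the standard library. This is a sketch that mirrors the cURL call above; the function names are hypothetical, and the shape of the JSON response is not documented here, so the caller should inspect it rather than rely on specific fields.

```python
import json
import urllib.request

API_BASE = "http://localhost:8000"


def build_index_request(repo_url, api_base=API_BASE):
    """Build the POST request that starts indexing a repository."""
    body = json.dumps({"repo_url": repo_url}).encode("utf-8")
    return urllib.request.Request(
        f"{api_base}/api/v1/repositories/",
        data=body,
        headers={"accept": "application/json", "Content-Type": "application/json"},
        method="POST",
    )


def index_repository(repo_url, api_base=API_BASE):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(build_index_request(repo_url, api_base)) as resp:
        return json.load(resp)


# Usage (requires the services to be running):
#   response = index_repository("https://github.com/tiangolo/fastapi")
#   print(response)
```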

2. Check Analysis Status

Use the session_id returned from the previous step to check the analysis progress.

URL: /api/v1/repositories/{session_id}/status
Method: GET
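
Because indexing is asynchronous, a client typically polls this endpoint until the repository is ready. The sketch below shows one way to do that. It assumes the response is a JSON object with a `status` field, which this README does not confirm; only the COMPLETED value comes from the documentation. The function names, polling interval, and timeout are illustrative choices.

```python
import json
import time
import urllib.request


def fetch_status(session_id, api_base="http://localhost:8000"):
    """GET the status endpoint and return the decoded JSON body."""
    url = f"{api_base}/api/v1/repositories/{session_id}/status"
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


def wait_until_completed(session_id, fetch=fetch_status, interval=5.0,
                         timeout=1800.0, sleep=time.sleep):
    """Poll until the status is COMPLETED or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        payload = fetch(session_id)
        # Assumption: the status is exposed under a "status" key.
        if payload.get("status") == "COMPLETED":
            return payload
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Indexing of {session_id} did not complete in time")
        sleep(interval)


# Usage (requires the services to be running):
#   wait_until_completed("<session_id from step 1>")
```

Passing `fetch` and `sleep` as parameters keeps the loop testable without a running server.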

3. Chat with the Repository

Once the repository status changes to COMPLETED, you can start asking questions.

URL: /api/v1/repositories/{session_id}/query
Method: POST
Body:

{
  "query": "How to handle CORS in FastAPI?"
}
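
A question can be sent from Python in the same way. This sketch builds the POST request for the query endpoint above; the function names are hypothetical, and the structure of the answer payload is not specified in this README, so the example simply returns the decoded JSON.

```python
import json
import urllib.parse
import urllib.request


def build_query_request(session_id, query, api_base="http://localhost:8000"):
    """Build the POST request that asks a question about an indexed repository."""
    # Quote the ID so unexpected characters cannot alter the URL path.
    safe_id = urllib.parse.quote(session_id, safe="")
    return urllib.request.Request(
        f"{api_base}/api/v1/repositories/{safe_id}/query",
        data=json.dumps({"query": query}).encode("utf-8"),
        headers={"accept": "application/json", "Content-Type": "application/json"},
        method="POST",
    )


def ask(session_id, query, api_base="http://localhost:8000"):
    """Send the question and return the decoded JSON response."""
    with urllib.request.urlopen(build_query_request(session_id, query, api_base)) as resp:
        return json.load(resp)


# Usage (requires a repository whose status is COMPLETED):
#   print(ask("<session_id>", "How to handle CORS in FastAPI?"))
```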