A Locally‑Hosted AI Document Summarizer & Q&A Backend

sunnypatell/smartdoc-ai

SmartDoc AI 🤖
A self-hosted AI document summarizer and Q&A system that processes documents locally - no API keys needed.

Features

  • 🔎 Document upload and text extraction (PDF/TXT)
  • 📝 Automatic document summarization
  • ❓ Question answering system
  • 🔍 Semantic search using FAISS
  • 💻 Local processing with no external APIs
  • ⚡ FastAPI backend ready for React frontend

Tech Stack

  • Frontend: React/Next.js, TailwindCSS
  • Backend: FastAPI
  • AI Models:
    • Summarization: t5-small (~300MB)
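      # Summarization pipeline (the model is downloaded on first use, then cached)
      from transformers import pipeline
      summarizer = pipeline(
          "summarization",
          model="t5-small",
          tokenizer="t5-small",
          framework="pt"  # use the PyTorch backend
      )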
    • Q&A: distilbert-base-uncased-distilled-squad (~250MB)
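      # Extractive question-answering pipeline
      from transformers import pipeline
      qa_model = pipeline(
          "question-answering",
          model="distilbert-base-uncased-distilled-squad",
          framework="pt"  # use the PyTorch backend
      )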
    • Embeddings: all-MiniLM-L6-v2 (~90MB)
      from sentence_transformers import SentenceTransformer
      embedding_model = SentenceTransformer("all-MiniLM-L6-v2")
    • Vector Search: FAISS

Total model size: ~640MB
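
To make the semantic search step more concrete, here is a minimal sketch of the kind of nearest-neighbour lookup that a vector index such as FAISS performs over chunk embeddings. It is written in plain Python rather than with FAISS, and tiny hand-made vectors stand in for real all-MiniLM-L6-v2 embeddings. The names cosine and top_k are illustrative and are not taken from this project's code.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k(query_vec, chunk_vecs, k=2):
    """Return (index, score) pairs for the k chunks most similar to the query."""
    scored = [(i, cosine(query_vec, v)) for i, v in enumerate(chunk_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Toy 3-dimensional "embeddings" (assumption: real ones have 384 dimensions).
chunks = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.7, 0.7, 0.0]]
query = [0.9, 0.1, 0.0]
print(top_k(query, chunks))
```

A brute-force scan like this is fine for a handful of chunks; a dedicated index becomes worthwhile as the number of chunks grows.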

Setup & Installation

Windows Environment Setup (Backend)

  1. Create virtual environment:
python -m venv venv
  2. Activate virtual environment:
venv\Scripts\activate
  3. Install required packages:
pip install fastapi uvicorn python-multipart PyPDF2 transformers sentence-transformers faiss-cpu torch numpy

Alternatively, run pip install -r requirements.txt to install all dependencies.

Run the server

uvicorn smartdoc_backend:app --reload

The server will be available at http://127.0.0.1:8000. If it is not reachable at that address, check the terminal output: uvicorn prints the address and port it is listening on.
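
If you would rather check reachability from a script, the sketch below probes the backend using only the standard library. It requests /docs, the interactive documentation page that FastAPI serves by default; this assumes the app has not disabled or moved that page. The function name is_server_up is hypothetical.

```python
import urllib.error
import urllib.request

def is_server_up(base_url="http://127.0.0.1:8000", timeout=2.0):
    """Return True if an HTTP server answers at base_url + '/docs'."""
    try:
        with urllib.request.urlopen(f"{base_url}/docs", timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server replied, just not with a 2xx status.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, timed out, or host unreachable.
        return False

if __name__ == "__main__":
    print("backend reachable:", is_server_up())
```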

Exit the virtual environment with the deactivate command.

Frontend Setup (Next.js)

  1. Navigate to frontend directory: cd .\frontend\

  2. Install Node.js dependencies: npm install

  3. Start development server: npm run dev

Frontend available at: http://localhost:3000

Model Storage

Models are cached in:

  • Windows: C:\Users\<YourUsername>\.cache\huggingface\hub
  • Linux/MacOS: ~/.cache/huggingface/hub
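
To confirm where the models ended up and how much disk space they use, a small helper along these lines may be useful. It is a sketch that checks the HF_HUB_CACHE and HF_HOME environment variables before falling back to the default location listed above; other overrides are not handled. The function names hf_hub_cache_dir and dir_size_mb are hypothetical.

```python
import os
from pathlib import Path

def hf_hub_cache_dir(env=None):
    """Resolve the Hugging Face hub cache directory.

    Checks HF_HUB_CACHE, then HF_HOME, then falls back to the default
    ~/.cache/huggingface/hub location.
    """
    env = os.environ if env is None else env
    if env.get("HF_HUB_CACHE"):
        return Path(env["HF_HUB_CACHE"]).expanduser()
    if env.get("HF_HOME"):
        return Path(env["HF_HOME"]).expanduser() / "hub"
    return Path.home() / ".cache" / "huggingface" / "hub"

def dir_size_mb(path):
    """Total size of regular files under path in MB (0.0 if path is missing)."""
    path = Path(path)
    if not path.is_dir():
        return 0.0
    # Skip symlinks so files linked from snapshot folders are not counted twice.
    total = sum(
        p.stat().st_size
        for p in path.rglob("*")
        if p.is_file() and not p.is_symlink()
    )
    return total / (1024 * 1024)

if __name__ == "__main__":
    cache = hf_hub_cache_dir()
    print(f"{cache}: {dir_size_mb(cache):.0f} MB")
```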

API Endpoints

Document Management

  • POST /upload - Upload PDF/text documents
  • GET /documents - List all documents
  • GET /document/{doc_id} - Get document metadata

AI Features

  • GET /document/{doc_id}/summary - Generate document summary
  • POST /document/{doc_id}/query - Ask questions about document content
  • GET /document/{doc_id}/chunks - Get document chunks (debug)
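
The chunks endpoint implies that documents are split into smaller pieces before they are embedded and searched. The splitting strategy is not described here, so the sketch below shows just one common approach: fixed-size word windows with some overlap. The function chunk_text and its chunk_size and overlap parameters are hypothetical and may differ from what the backend actually does.

```python
def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into windows of chunk_size words that share overlap words."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # this window already reached the end of the text
    return chunks

sample = " ".join(f"w{i}" for i in range(10))
print(chunk_text(sample, chunk_size=4, overlap=1))
```

Overlap keeps a sentence that straddles a boundary visible in both neighbouring chunks, at the cost of some duplicated text in the index.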

Usage Example without frontend

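import requests

BASE_URL = 'http://127.0.0.1:8000'

# Upload a document (the context manager closes the file afterwards)
with open('document.pdf', 'rb') as f:
    response = requests.post(f'{BASE_URL}/upload', files={'file': f})
response.raise_for_status()  # stop early if the upload failed
doc_id = response.json()['doc_id']

# Get a summary (responses are assumed to be JSON, like /upload)
summary = requests.get(f'{BASE_URL}/document/{doc_id}/summary')
summary.raise_for_status()
print(summary.json())

# Ask a question
query = {'query': 'What is this document about?'}
answer = requests.post(f'{BASE_URL}/document/{doc_id}/query', json=query)
answer.raise_for_status()
print(answer.json())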
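
The example above builds each URL by hand. When scripting against several documents, it can be tidier to derive the per-document URLs in one place. The helper below is a small sketch that uses only the endpoint paths listed under API Endpoints; the name document_urls and the dictionary keys are hypothetical.

```python
from urllib.parse import quote

def document_urls(doc_id, base_url="http://127.0.0.1:8000"):
    """Map each per-document endpoint to its full URL for the given doc_id."""
    # Percent-encode the id so unusual characters cannot break the path.
    doc = f"{base_url.rstrip('/')}/document/{quote(str(doc_id), safe='')}"
    return {
        "metadata": doc,            # GET
        "summary": f"{doc}/summary",  # GET
        "query": f"{doc}/query",      # POST
        "chunks": f"{doc}/chunks",    # GET (debug)
    }

print(document_urls("abc123")["summary"])
```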

License

MIT License - See LICENSE for more details.
