SmartDoc AI 🤖
A self-hosted AI document summarizer and Q&A system that processes documents locally - no API keys needed.
- 🔎 Document upload and text extraction (PDF/TXT)
- 📝 Automatic document summarization
- ❓ Question answering system
- 🔍 Semantic search using FAISS
- 💻 Local processing with no external APIs
- ⚡ FastAPI backend ready for React frontend
- Frontend: React/Next.js, TailwindCSS
- Backend: FastAPI
- AI Models:
  - Summarization: `t5-small` (~300MB)
    `summarizer = pipeline("summarization", model="t5-small", tokenizer="t5-small", framework="pt")`
  - Q&A: `distilbert-base-uncased-distilled-squad` (~250MB)
    `qa_model = pipeline("question-answering", model="distilbert-base-uncased-distilled-squad", framework="pt")`
  - Embeddings: `all-MiniLM-L6-v2` (~90MB)
    `embedding_model = SentenceTransformer("all-MiniLM-L6-v2")`
- Vector Search: FAISS

Total model size: ~640MB
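
These one-liners are the standard Hugging Face `pipeline` and `sentence-transformers` constructors used by the backend. As a rough sketch of how the pieces can fit together for semantic search and Q&A (the chunk handling and the `build_index`/`answer` helpers below are illustrative assumptions, not the actual `smartdoc_backend` code):

```python
# Illustrative sketch: wiring the three models and FAISS together.
# Function and variable names here are hypothetical, not taken from smartdoc_backend.py.
import faiss
from transformers import pipeline
from sentence_transformers import SentenceTransformer

summarizer = pipeline("summarization", model="t5-small", tokenizer="t5-small", framework="pt")
qa_model = pipeline("question-answering", model="distilbert-base-uncased-distilled-squad", framework="pt")
embedding_model = SentenceTransformer("all-MiniLM-L6-v2")

def build_index(chunks: list[str]) -> faiss.IndexFlatL2:
    """Embed document chunks and store them in a FAISS index."""
    vectors = embedding_model.encode(chunks).astype("float32")
    index = faiss.IndexFlatL2(vectors.shape[1])  # 384 dimensions for all-MiniLM-L6-v2
    index.add(vectors)
    return index

def answer(question: str, chunks: list[str], index: faiss.IndexFlatL2, k: int = 3) -> str:
    """Retrieve the most similar chunks, then run extractive Q&A over them."""
    query_vec = embedding_model.encode([question]).astype("float32")
    _, ids = index.search(query_vec, min(k, len(chunks)))
    context = " ".join(chunks[i] for i in ids[0])
    return qa_model(question=question, context=context)["answer"]
```

A summary would be produced the same way, e.g. `summarizer(text, max_length=150, min_length=30)[0]["summary_text"]`.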
- Create a virtual environment:
  `python -m venv venv`
- Activate the virtual environment:
  `venv\Scripts\activate` (Windows) or `source venv/bin/activate` (Linux/macOS)
- Install the required packages:
  `pip install fastapi uvicorn python-multipart PyPDF2 transformers sentence-transformers faiss-cpu torch numpy`
  or run `pip install -r requirements.txt` to install all of the dependencies at once (a sample file is sketched below).
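
That requirements.txt should cover the same packages as the explicit command above; a minimal, unpinned version would look like this (exact version pins are left to the repository):

```
fastapi
uvicorn
python-multipart
PyPDF2
transformers
sentence-transformers
faiss-cpu
torch
numpy
```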
Run the backend server:

`uvicorn smartdoc_backend:app --reload`

The server will be available at http://127.0.0.1:8000. If uvicorn binds to a different address or port, check the terminal output; it prints the address and port it is actually serving on.
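
To confirm the backend is reachable once uvicorn is running, you can hit FastAPI's interactive docs page, which is served at /docs by default:

```python
import requests

# FastAPI exposes interactive API docs at /docs unless they are disabled.
resp = requests.get("http://127.0.0.1:8000/docs")
print(resp.status_code)  # 200 means the backend is up
```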
When you are finished, exit the virtual environment with the `deactivate` command.
- Navigate to the frontend directory: `cd .\frontend\`
- Install Node.js dependencies: `npm install`
- Start the development server: `npm run dev`

The frontend will be available at http://localhost:3000.
Models are cached in:
- Windows: `C:\Users\<YourUsername>\.cache\huggingface\hub`
- Linux/MacOS: `~/.cache/huggingface/hub`
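
Instantiating the pipelines once is enough to populate this cache, so you can optionally pre-download everything before the first request. A small convenience script (model names taken from the Tech Stack section above; this is not part of the backend itself):

```python
# Optional one-off script: downloads the three models into the Hugging Face cache
# so the backend's first request doesn't block on ~640MB of downloads.
from transformers import pipeline
from sentence_transformers import SentenceTransformer

pipeline("summarization", model="t5-small", tokenizer="t5-small", framework="pt")
pipeline("question-answering", model="distilbert-base-uncased-distilled-squad", framework="pt")
SentenceTransformer("all-MiniLM-L6-v2")
print("Models downloaded and cached.")
```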
- `POST /upload` - Upload PDF/text documents
- `GET /documents` - List all documents
- `GET /document/{doc_id}` - Get document metadata
- `GET /document/{doc_id}/summary` - Generate document summary
- `POST /document/{doc_id}/query` - Ask questions about document content
- `GET /document/{doc_id}/chunks` - Get document chunks (debug)
Example usage with Python and the `requests` library:

    import requests

    # Upload a document
    with open('document.pdf', 'rb') as f:
        response = requests.post('http://127.0.0.1:8000/upload', files={'file': f})
    doc_id = response.json()['doc_id']

    # Get a summary
    summary = requests.get(f'http://127.0.0.1:8000/document/{doc_id}/summary')

    # Ask a question
    query = {'query': 'What is this document about?'}
    answer = requests.post(f'http://127.0.0.1:8000/document/{doc_id}/query', json=query)
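
The remaining endpoints follow the same pattern; for example, continuing with the same `doc_id` (assuming JSON responses, as above):

```python
# List all uploaded documents
docs = requests.get('http://127.0.0.1:8000/documents').json()

# Fetch metadata for a single document
meta = requests.get(f'http://127.0.0.1:8000/document/{doc_id}').json()

# Inspect the stored chunks (debug endpoint)
chunks = requests.get(f'http://127.0.0.1:8000/document/{doc_id}/chunks').json()
```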
MIT License - See LICENSE for more details.