API Reference

Complete API documentation for integrating with SageAI. Build custom interfaces and automation scripts, or integrate SageAI with your existing research tools.

Quick Facts

  • Protocol: REST API over HTTP
  • Format: JSON request/response
  • Default Port: 8000
  • API Version: v1

Authentication

Current Version: SageAI runs locally without authentication. All endpoints are accessible without credentials. For production deployments, consider adding authentication via a reverse proxy.

Base URL

All API endpoints are prefixed with:

http://localhost:8000/api/v1

💡 If you change the port in Docker Compose, update the base URL accordingly.

Query Papers

POST /api/v1/query

Submit a natural language question and receive an AI-generated answer with citations from your uploaded papers.

Request Body

{
  "question": "What methodology was used in the transformer paper?",
  "top_k": 5,
  "paper_ids": [1, 3]  // optional: limit to specific papers
}

Parameters

Parameter   Type      Required   Description
question    string    yes        The question to ask about your papers
top_k       integer   no         Number of relevant chunks to retrieve (1-10, default: 5)
paper_ids   array     no         Limit search to specific paper IDs (omit for all papers)

Response

{
  "answer": "The transformer paper uses a self-attention mechanism...",
  "citations": [
    {
      "paper_title": "Attention is All You Need",
      "section": "Methodology",
      "page": 3,
      "relevance_score": 0.89,
      "chunk_text": "..."
    }
  ],
  "sources_used": ["paper3_nlp_transformers.pdf"],
  "confidence": 0.85,
  "query_time_ms": 1250,
  "cached": false
}
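
Example (Python)

A minimal sketch using the requests library (assuming the default base URL above); it restricts the search with the optional paper_ids field and prints each citation from the response:

import requests

BASE_URL = "http://localhost:8000/api/v1"

payload = {
    "question": "What methodology was used in the transformer paper?",
    "top_k": 5,
    "paper_ids": [1, 3],  # optional: restrict the search to these papers
}
resp = requests.post(f"{BASE_URL}/query", json=payload)
resp.raise_for_status()
data = resp.json()

print(data["answer"])
for c in data["citations"]:
    # page and relevance_score come straight from the response shown above
    print(f'{c["paper_title"]} (p.{c["page"]}) score={c["relevance_score"]:.2f}')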

List All Papers

GET /api/v1/papers

Retrieve a list of all uploaded papers with metadata and processing status.

Response

{
  "papers": [
    {
      "id": 1,
      "title": "Attention is All You Need",
      "filename": "transformer_paper.pdf",
      "status": "indexed",
      "chunk_count": 42,
      "vector_count": 42,
      "upload_date": "2025-10-31T10:30:00Z",
      "indexed_date": "2025-10-31T10:31:15Z",
      "file_size_mb": 2.4
    }
  ],
  "total": 1
}
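
Example (Python)

A small sketch with requests (default base URL assumed) that lists papers and shows their indexing status:

import requests

BASE_URL = "http://localhost:8000/api/v1"

resp = requests.get(f"{BASE_URL}/papers")
resp.raise_for_status()
for paper in resp.json()["papers"]:
    # status is "processing" while the embedder works, "indexed" once searchable
    print(f'{paper["id"]:>3}  {paper["status"]:<10}  {paper["title"]}')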

Upload Paper

POST /api/v1/papers/upload

Upload a PDF research paper. The embedder service will extract text, chunk it, and create vector embeddings for semantic search.

Request

Content-Type: multipart/form-data
Form Field: file (PDF file)

Example (cURL)

curl -X POST http://localhost:8000/api/v1/papers/upload \
  -F "file=@/path/to/paper.pdf"

Response

{
  "paper_id": 5,
  "title": "BERT: Pre-training of Deep Bidirectional Transformers",
  "filename": "bert_paper.pdf",
  "status": "processing",
  "message": "Paper uploaded successfully. Processing in background."
}
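
Example (Python)

Because the upload response reports status "processing" while indexing runs in the background, a common pattern is to poll the papers list until the status becomes "indexed". A minimal sketch, assuming the status values shown above and an arbitrary polling interval:

import time
import requests

BASE_URL = "http://localhost:8000/api/v1"

def upload_and_wait(file_path, poll_seconds=5, timeout_seconds=300):
    # Upload the PDF; the embedder chunks and indexes it in the background.
    with open(file_path, "rb") as f:
        resp = requests.post(f"{BASE_URL}/papers/upload", files={"file": f})
    resp.raise_for_status()
    paper_id = resp.json()["paper_id"]

    # Poll the papers list until this paper reports status "indexed".
    deadline = time.time() + timeout_seconds
    while time.time() < deadline:
        papers = requests.get(f"{BASE_URL}/papers").json()["papers"]
        status = next((p["status"] for p in papers if p["id"] == paper_id), None)
        if status == "indexed":
            return paper_id
        time.sleep(poll_seconds)
    raise TimeoutError(f"Paper {paper_id} was not indexed within {timeout_seconds}s")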

Delete Paper

DELETE /api/v1/papers/:id

Remove a paper and all associated data (chunks, vectors, metadata).

Response

{
  "message": "Paper deleted successfully",
  "paper_id": 5
}
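
Example (Python)

A minimal sketch with requests (default base URL assumed) that deletes a paper by ID:

import requests

BASE_URL = "http://localhost:8000/api/v1"

def delete_paper(paper_id):
    # Removes the paper plus its chunks, vectors, and metadata.
    resp = requests.delete(f"{BASE_URL}/papers/{paper_id}")
    resp.raise_for_status()  # raises on 404 if the paper ID doesn't exist
    return resp.json()

print(delete_paper(5)["message"])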

Paper Statistics

GET /api/v1/papers/:id/stats

Get detailed statistics for a specific paper.

Response

{
  "paper_id": 1,
  "title": "Attention is All You Need",
  "filename": "transformer_paper.pdf",
  "status": "indexed",
  "chunk_count": 42,
  "vector_count": 42,
  "upload_date": "2025-10-31T10:30:00Z",
  "indexed_date": "2025-10-31T10:31:15Z",
  "processing_time_ms": 75000,
  "file_size_mb": 2.4,
  "sections": ["Abstract", "Introduction", "Methods", "Results", "Conclusion"],
  "query_count": 87  // how many times this paper was referenced
}
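
Example (Python)

A short sketch fetching statistics for paper 1 (default base URL assumed):

import requests

BASE_URL = "http://localhost:8000/api/v1"

stats = requests.get(f"{BASE_URL}/papers/1/stats").json()
# query_count tracks how many times this paper was referenced in answers
print(f'{stats["title"]}: {stats["chunk_count"]} chunks, referenced {stats["query_count"]} times')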

Query History

GET /api/v1/queries/history?limit=20&offset=0&include_answer=true

Retrieve paginated query history with optional full answers.

Query Parameters

Parameter        Type      Description
limit            integer   Results per page (default: 20)
offset           integer   Skip N results (default: 0)
include_answer   boolean   Include full answer text (default: false)

Response

{
  "queries": [
    {
      "query_id": "q_123",
      "question": "What is self-attention?",
      "timestamp": "2025-10-31T14:22:00Z",
      "confidence": 0.92,
      "rating": 5,
      "query_time_ms": 1200,
      "answer": "Self-attention is..."  // only if include_answer=true
    }
  ],
  "total": 150,
  "limit": 20,
  "offset": 0
}
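
Example (Python)

A sketch that pages through the full history using limit and offset; it assumes the include_answer flag is passed as the literal strings true/false in the query string:

import requests

BASE_URL = "http://localhost:8000/api/v1"

def iter_query_history(page_size=20, include_answer=False):
    # Walk the paginated history until the reported total is reached.
    offset = 0
    while True:
        resp = requests.get(
            f"{BASE_URL}/queries/history",
            params={
                "limit": page_size,
                "offset": offset,
                "include_answer": str(include_answer).lower(),  # assumed true/false format
            },
        )
        resp.raise_for_status()
        page = resp.json()
        yield from page["queries"]
        offset += page_size
        if offset >= page["total"]:
            break

for q in iter_query_history():
    print(q["timestamp"], q["question"])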

Rate Answer

PATCH /api/v1/queries/:id/rating

Submit a user rating (1-5 stars) for a query response.

Request Body

{
  "rating": 4
}

Response

{
  "message": "Rating saved successfully",
  "query_id": "q_123",
  "rating": 4
}
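
Example (Python)

A minimal sketch submitting a 4-star rating for query q_123 (default base URL assumed):

import requests

BASE_URL = "http://localhost:8000/api/v1"

def rate_answer(query_id, rating):
    # rating must be an integer from 1 to 5
    resp = requests.patch(f"{BASE_URL}/queries/{query_id}/rating", json={"rating": rating})
    resp.raise_for_status()
    return resp.json()

print(rate_answer("q_123", 4)["message"])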

Popular Analytics

GET /api/v1/analytics/popular

Get aggregated data on most queried topics and most referenced papers.

Response

{
  "top_questions": [
    { "question": "What is self-attention?", "count": 15 },
    { "question": "How does BERT work?", "count": 12 }
  ],
  "top_papers": [
    { "paper_id": 1, "title": "Attention is All You Need", "reference_count": 45 },
    { "paper_id": 3, "title": "BERT Paper", "reference_count": 38 }
  ]
}
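
Example (Python)

A short sketch printing the most asked questions and most referenced papers (default base URL assumed):

import requests

BASE_URL = "http://localhost:8000/api/v1"

data = requests.get(f"{BASE_URL}/analytics/popular").json()
for q in data["top_questions"]:
    print(f'{q["count"]:>3}x asked: {q["question"]}')
for p in data["top_papers"]:
    print(f'{p["reference_count"]:>3} refs:  {p["title"]}')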

Health Checks

Liveness Check

GET /api/v1/health/healthz

Returns 200 if the API server is alive.

{ "status": "ok" }

Readiness Check

GET /api/v1/health/readyz

Returns 200 if the API is ready to serve requests. Checks all dependencies.

{
  "status": "ready",
  "dependencies": {
    "mongo": { "status": "healthy", "response_time_ms": 5 },
    "redis": { "status": "healthy", "response_time_ms": 2 },
    "qdrant": { "status": "healthy", "response_time_ms": 10 },
    "ollama": { "status": "healthy", "response_time_ms": 50 }
  }
}
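
Example (Python)

A sketch that waits for the readiness endpoint before sending traffic, useful in scripts that start the stack with Docker Compose; it assumes the endpoint returns a non-200 status (or refuses connections) until all dependencies are healthy:

import time
import requests

BASE_URL = "http://localhost:8000/api/v1"

def wait_until_ready(timeout_seconds=120):
    # /health/readyz returns 200 only when Mongo, Redis, Qdrant, and Ollama all respond.
    deadline = time.time() + timeout_seconds
    while time.time() < deadline:
        try:
            if requests.get(f"{BASE_URL}/health/readyz", timeout=5).status_code == 200:
                return True
        except requests.ConnectionError:
            pass  # the API container may still be starting
        time.sleep(2)
    return False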

Error Codes

Code   Status                  Description
400    Bad Request             Invalid parameters or malformed request
404    Not Found               Paper or query ID doesn't exist
413    Payload Too Large       PDF file exceeds size limit
429    Too Many Requests       Rate limit exceeded
500    Internal Server Error   Something went wrong on the server
503    Service Unavailable     Dependent service (Ollama, Qdrant, etc.) is down

Error Response Format

{
  "error": "Invalid top_k value",
  "message": "top_k must be between 1 and 10",
  "code": 400
}
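
Example (Python)

A sketch showing how a client might surface this error format; the deliberately out-of-range top_k is only there to trigger a 400 response:

import requests

BASE_URL = "http://localhost:8000/api/v1"

resp = requests.post(f"{BASE_URL}/query", json={"question": "What is BERT?", "top_k": 50})
if not resp.ok:
    err = resp.json()
    # "error" is the short label, "message" explains how to fix the request
    print(f'{err["code"]}: {err["error"]} - {err["message"]}')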

Rate Limits

Default Limits

  • Window: 60 seconds
  • Max Requests: 120 per window
  • Configurable: Via RATE_LIMIT_* environment variables

When the rate limit is exceeded, you'll receive:

{
  "error": "Rate limit exceeded",
  "message": "Too many requests. Please try again in 30 seconds.",
  "code": 429,
  "retry_after": 30
}
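
Example (Python)

A sketch that backs off using the retry_after hint when a 429 is returned (default base URL assumed):

import time
import requests

BASE_URL = "http://localhost:8000/api/v1"

def query_with_retry(payload, max_attempts=3):
    # Wait for the server-suggested retry_after before trying again.
    for _ in range(max_attempts):
        resp = requests.post(f"{BASE_URL}/query", json=payload)
        if resp.status_code != 429:
            resp.raise_for_status()
            return resp.json()
        time.sleep(resp.json().get("retry_after", 30))
    raise RuntimeError("Rate limit still exceeded after retries")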

Code Examples

JavaScript (Fetch)

// Query papers
async function queryPapers(question, topK = 5) {
  const response = await fetch('http://localhost:8000/api/v1/query', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ question, top_k: topK })
  });
  
  if (!response.ok) {
    throw new Error(`HTTP error! status: ${response.status}`);
  }
  
  const data = await response.json();
  console.log('Answer:', data.answer);
  console.log('Confidence:', data.confidence);
  return data;
}

// Usage
queryPapers('What is self-attention?')
  .then(result => console.log(result))
  .catch(error => console.error('Error:', error));

Python (Requests)

import requests

BASE_URL = "http://localhost:8000/api/v1"

# Query papers
def query_papers(question, top_k=5, paper_ids=None):
    payload = {
        "question": question,
        "top_k": top_k
    }
    if paper_ids:
        payload["paper_ids"] = paper_ids
    
    response = requests.post(f"{BASE_URL}/query", json=payload)
    response.raise_for_status()
    return response.json()

# Upload paper
def upload_paper(file_path):
    with open(file_path, 'rb') as f:
        files = {'file': f}
        response = requests.post(f"{BASE_URL}/papers/upload", files=files)
        response.raise_for_status()
        return response.json()

# Usage
result = query_papers("What is self-attention?")
print(f"Answer: {result['answer']}")
print(f"Confidence: {result['confidence']}")

cURL

# Query papers
curl -X POST http://localhost:8000/api/v1/query \
  -H "Content-Type: application/json" \
  -d '{
    "question": "What is self-attention?",
    "top_k": 5
  }'

# Upload paper
curl -X POST http://localhost:8000/api/v1/papers/upload \
  -F "file=@/path/to/paper.pdf"

# Get query history
curl "http://localhost:8000/api/v1/queries/history?limit=10&include_answer=true"

# Health check
curl http://localhost:8000/api/v1/health/readyz

Ready to Integrate?

Start building with SageAI's powerful API. Questions? Check the docs or join our community.