CLI Reference

PaperRAG is invoked via the paperrag command. Running it without a subcommand starts the interactive REPL.

Global Options (REPL mode)

paperrag [OPTIONS]

Option

Short

Description

--index-dir PATH

-i

Index directory (required unless set in .paperragrc)

--input-dir PATH

-d

PDF directory

--model NAME

-m

LLM model name. Ollama names (for example qwen2.5:1.5b) use Ollama. Local .gguf paths and HuggingFace repo IDs use llama.cpp via llama-server.

--topk N

-k

Number of chunks to retrieve (default: 3)

--threshold FLOAT

-t

Similarity score threshold (0.0-1.0)

--temperature FLOAT

LLM temperature (0.0-2.0)

--max-tokens N

Maximum response tokens (default: 256)

--ctx-size N

LLM context window size (default: 2048)

--system-prompt TEXT

Override the default system prompt

--version

Show version and license

--help

-h

Show help

paperrag index

Index PDF files into the FAISS vector store.

paperrag index [OPTIONS]

Option

Short

Description

--input-dir PATH

-d

PDF directory (default: ~/Documents/Mendeley Desktop)

--index-dir PATH

-i

Index directory (default: <input-dir>/.paperrag-index)

--force

-f

Force full re-index

--checkpoint-interval N

-c

Save index every N PDFs (0 to disable)

--workers N

-w

Number of parallel workers (0 = auto-detect)

--ocr MODE

OCR mode: auto (default), always, never

--manifest PATH

CSV manifest file with paper metadata

Examples

# Index with default settings
paperrag index --input-dir ~/papers

# Index a single PDF into ~/papers/.paperrag-index
paperrag index --input-dir ~/papers/paper.pdf

# Force re-index with 4 workers, OCR disabled
paperrag index --input-dir ~/papers --force --workers 4 --ocr never

# Index with a metadata manifest
paperrag index --input-dir ~/papers --manifest papers.csv

paperrag review

Index a PDF file or directory, then drop directly into the interactive review session.

paperrag review INPUT_PATH [OPTIONS]

Option

Short

Description

--index-dir PATH

-i

Index directory. If omitted, auto-derived from the input path.

--topk N

-k

Number of chunks to retrieve (default: current config)

--threshold FLOAT

-t

Similarity score threshold (0.0-1.0)

--temperature FLOAT

LLM temperature

--max-tokens N

Maximum response tokens

--ctx-size N

LLM context window size

--system-prompt TEXT

Override the default system prompt

--model NAME

-m

LLM model name

Default index location:

  • Directory input: <input-dir>/.paperrag-index

  • Single PDF input: <pdf-parent-dir>/.paperrag-index

Single-PDF behaviour: when a single PDF is passed, the REPL starts with queries automatically focused on that paper. If the index contains other papers (e.g. previously indexed alongside it), a one-line hint is shown:

Auto-focused on 'paper.pdf'
2 other paper(s) also indexed — /focus list to browse, /focus to search all

Use /focus (no argument) at any time to remove the filter and search across all indexed papers.

Examples:

# Review one paper — REPL auto-focuses on it
paperrag review ~/papers/paper.pdf

# Review a whole directory and increase answer length
paperrag review ~/papers --max-tokens 512

# Review using a custom index location
paperrag review ~/papers/paper.pdf --index-dir /tmp/paperrag-review

paperrag query

Run a one-off query against the indexed papers.

Use --no-llm to stop after retrieval and print chunk-level results directly.

paperrag query QUESTION [OPTIONS]

Option

Short

Description

--index-dir PATH

-i

Index directory (required)

--input-dir PATH

-d

PDF directory

--top-k N

-k

Number of results (default: 3)

--threshold FLOAT

-t

Similarity score threshold (0.0-1.0)

--no-llm

Skip answer generation and print raw retrieval results

--temperature FLOAT

LLM temperature (0.0-2.0)

--max-tokens N

Maximum response tokens (default: 256)

--ctx-size N

LLM context window size (default: 2048)

--system-prompt TEXT

Override the default system prompt

--model NAME

-m

LLM model name

Examples

# Query with Ollama
paperrag query "what is speech chain?" --index-dir ~/papers -m qwen3:1.7b

# Query with a local GGUF model through llama.cpp
paperrag query "summarize the paper" --index-dir ~/papers -m ./models/qwen3-1.7b.gguf

# Query with a HuggingFace GGUF repo through llama.cpp
paperrag query "summarize the paper" --index-dir ~/papers -m Qwen/Qwen3-1.7B-GGUF

# Retrieval-only mode (no LLM call)
paperrag query "attention mechanism" --index-dir ~/papers --no-llm

# Adjust retrieval parameters
paperrag query "attention mechanism" --index-dir ~/papers -k 10 -t 0.3 --max-tokens 512

paperrag evaluate

Evaluate retrieval quality using a JSONL benchmark file.

paperrag evaluate BENCHMARK_FILE [OPTIONS]

Option

Short

Description

--top-k N

-k

Number of results (default: 3)

--input-dir PATH

-d

PDF directory

--index-dir PATH

-i

Index directory

The benchmark file should be JSONL with each line containing:

{"question": "...", "relevant_documents": ["path1.pdf", "path2.pdf"]}

REPL Slash Commands

Inside the REPL, all control commands are slash-prefixed:

Command

Description

<any text>

Query the indexed papers

/index

Re-index the current PDF directory or file

/index <path>

Re-index a different PDF file or directory

/focus <name>

Focus queries on a specific paper (use list to see options)

/topk <n>

Set top-k retrieval

/threshold <n>

Set minimum similarity score

/temperature <n>

Set LLM temperature

/max-tokens <n>

Set maximum response tokens

/ctx-size <n>

Set LLM context window size

/prompt <text>

Set the system prompt

/no-llm

Toggle retrieval-only mode for subsequent queries

/no-llm on|off

Explicitly enable or disable retrieval-only mode

/model <name>

Switch the active LLM backend/model

/config

Show the current effective configuration

/rc

Show loaded .paperragrc files and values

/help

Show REPL help

/exit or /quit

Exit the REPL

When /no-llm mode is active, queries stop after retrieval and print scored chunk snippets instead of generating an answer. Use /no-llm on or /no-llm off when you want an explicit state change instead of a toggle.