
Chat with PDF

Indexes a PDF into a vector store and answers questions about it, citing pages and using stratified sampling during retrieval.

Summary

The Chat with PDF Agent lets you ask questions about a PDF and get answers grounded in the document, with a page citation for every claim. It indexes the PDF into a vector store, retrieves only the relevant chunks per question using stratified sampling across page ranges, and can also generate comprehension quizzes from real passages. Reach for it to turn manuals, papers, and reports into something you can query.
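To make "stratified sampling across page ranges" concrete, here is a minimal sketch of one way it could work: split the document into equal page ranges, keep the best-scoring chunks from each range, and return them in page order. The `ScoredChunk` shape, the `stratifiedSample` function, and its parameters are hypothetical names chosen for illustration; the actual logic in tools/query_pdf.ts may differ.

```typescript
// Hypothetical shape of a retrieved chunk; the field names are assumptions.
interface ScoredChunk {
  text: string;
  page: number; // 1-based page number
  score: number; // similarity score, higher is better
}

// Keep the top `perStratum` chunks from each of `strata` equal page ranges,
// so that answers are not drawn from a single part of the document.
function stratifiedSample(
  chunks: ScoredChunk[],
  totalPages: number,
  strata: number,
  perStratum: number,
): ScoredChunk[] {
  if (totalPages < 1 || strata < 1 || perStratum < 1) return [];

  const pagesPerStratum = Math.ceil(totalPages / strata);
  const buckets: ScoredChunk[][] = [];
  for (let i = 0; i < strata; i++) buckets.push([]);

  // Assign each chunk to the page range it falls in (clamped to valid ranges).
  for (const chunk of chunks) {
    const raw = Math.floor((chunk.page - 1) / pagesPerStratum);
    const index = Math.max(0, Math.min(strata - 1, raw));
    buckets[index].push(chunk);
  }

  // Take the best-scoring chunks from every range.
  const picked: ScoredChunk[] = [];
  for (const bucket of buckets) {
    const best = bucket.slice().sort((a, b) => b.score - a.score);
    for (const chunk of best.slice(0, perStratum)) picked.push(chunk);
  }

  // Return in page order so citations read naturally.
  return picked.sort((a, b) => a.page - b.page);
}
```

The trade-off in this sketch is that a lower-scoring chunk from a sparsely matched range can displace a higher-scoring one from a dense range, which favors coverage of the whole document over raw similarity.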

Installation

$ pnpm dlx shadcn@latest add @agentcn/mastra/chat-with-pdf

Composition

├── config.ts              # Agent definition (model + config)
├── instructions.md        # Detailed assistant instructions
├── memory.ts              # Memory instance for conversation history
├── lib/
│   └── vector-store.ts    # LibSQLVector store + index management
├── tools/
│   ├── list_documents.ts  # List all indexed PDF documents
│   └── query_pdf.ts       # Query with stratified page sampling
└── workflows/
    └── index_pdf.ts       # 3-step workflow: download → chunk → embed

Customization

  • Swap the vector store. lib/vector-store.ts uses @mastra/libsql. To use Pinecone, Qdrant, Chroma, or pgvector instead, replace the vector store import in that file.
  • Tune chunking. Adjust the splitIntoChunks step's maxSize and overlap in workflows/index_pdf.ts.
  • Swap the embedding model. The tools and workflows use openai/text-embedding-3-small; change the model name in both places to use a different one.
  • Add quizzes. The instructions already support quiz generation from retrieved passages — extend them with your preferred format.
  • Add more tools. Create additional tools in tools/ — they'll be auto-discovered.
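
To illustrate what the maxSize and overlap settings mentioned under "Tune chunking" control, here is a minimal sketch of a sliding-window splitter. The `chunkText` function is a hypothetical stand-in, and measuring size in characters is an assumption; the splitIntoChunks step in workflows/index_pdf.ts may split on tokens or sentences instead.

```typescript
// Hypothetical character-based splitter; not the registry's actual implementation.
// Each chunk holds at most `maxSize` characters and repeats the last
// `overlap` characters of the previous chunk.
function chunkText(text: string, maxSize: number, overlap: number): string[] {
  if (maxSize <= 0) throw new Error("maxSize must be positive");
  if (overlap < 0 || overlap >= maxSize) {
    throw new Error("overlap must be at least 0 and less than maxSize");
  }

  const chunks: string[] = [];
  const step = maxSize - overlap; // how far the window advances each time

  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + maxSize));
    // Stop once the window has reached the end of the text.
    if (start + maxSize >= text.length) break;
  }
  return chunks;
}

const example = chunkText("abcdefghij", 4, 1);
// example is ["abcd", "defg", "ghij"]
```

A larger overlap lowers the chance that a sentence is cut off at a chunk boundary, at the cost of more chunks to embed and store.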