AI Systems Project
SpecPilot RAG
SpecPilot RAG is the kind of assistant I would build for long engineering manuals, silicon bring-up notes, and platform runbooks, where the real problem is not chat but finding the right source quickly and answering without drifting from the evidence. The working version behind this page uses lexical retrieval, a Hugging Face reranker in PyTorch, and a lightweight local language model to return citation-backed answers over a technical mini-corpus.
Why this project
Generic chat interfaces usually break down on dense technical documentation because the answer quality depends more on retrieval quality than on clever wording. This project is shaped around that reality: stronger candidate retrieval, an explicit reranking stage, grounded generation, and clearer citations. It is especially relevant for product specs, API references, EDA flow notes, and bring-up documents that engineers revisit under time pressure.
Model stack
- Lexical retrieval over chunked technical documents
- PyTorch cross-encoder reranker for relevance refinement
- Lightweight local Hugging Face language model, with generation constrained by retrieved evidence
- Evaluation loop for citation quality and factual drift
System architecture
Chunk large PDFs, runbooks, and wiki pages into retrieval-sized passages while preserving section headers, product names, and source metadata for downstream citation.
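A minimal sketch of the chunking step, assuming the document has already been converted to plain text and that headings are lines starting with `#`. The `Chunk` fields, the `chunk_document` name, and the `source#index` id scheme are illustrative choices of mine, not the project's actual schema; real PDFs and wiki pages would need their own heading detection.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    text: str      # passage text sent to retrieval
    source: str    # file or page the passage came from
    section: str   # nearest heading above the passage
    chunk_id: str  # stable id reused later for citations


def chunk_document(text: str, source: str, max_words: int = 120) -> list[Chunk]:
    """Split text into passages of at most max_words, tracking the current heading."""
    chunks: list[Chunk] = []
    section = "(no heading)"
    buffer: list[str] = []

    def flush() -> None:
        # Emit the buffered words as one chunk under the current heading.
        if buffer:
            chunk_id = f"{source}#{len(chunks)}"
            chunks.append(Chunk(" ".join(buffer), source, section, chunk_id))
            buffer.clear()

    for line in text.splitlines():
        line = line.strip()
        if line.startswith("#"):
            flush()  # close the passage that belongs to the previous heading
            section = line.lstrip("#").strip()
            continue
        for word in line.split():
            buffer.append(word)
            if len(buffer) >= max_words:
                flush()
    flush()
    return chunks
```

Carrying `section` and `source` on every chunk is what lets the answer stage cite a specific place in a manual without re-reading the original file.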
Use lexical retrieval over chunked technical notes to pull a focused candidate set before the more expensive ranking stage runs.
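The page does not say which lexical scoring function is used, so the sketch below assumes BM25, one common choice, written in pure Python. The `BM25Index` class, the tokenizer regex, and the `k1`/`b` defaults are my assumptions for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    # Keep identifiers such as "pll_ctrl" or "0x40" together as single tokens.
    return re.findall(r"[a-z0-9_]+", text.lower())


class BM25Index:
    def __init__(self, passages: list[str], k1: float = 1.5, b: float = 0.75):
        self.k1, self.b = k1, b
        self.docs = [Counter(tokenize(p)) for p in passages]
        self.lengths = [sum(d.values()) for d in self.docs]
        self.avg_len = sum(self.lengths) / max(len(self.docs), 1)
        # Document frequency: number of passages that contain each term.
        self.df = Counter(term for d in self.docs for term in d)

    def score(self, query: str, i: int) -> float:
        doc, n = self.docs[i], len(self.docs)
        total = 0.0
        for term in set(tokenize(query)):
            tf = doc.get(term, 0)
            if tf == 0:
                continue
            idf = math.log(1 + (n - self.df[term] + 0.5) / (self.df[term] + 0.5))
            norm = tf + self.k1 * (1 - self.b + self.b * self.lengths[i] / self.avg_len)
            total += idf * tf * (self.k1 + 1) / norm
        return total

    def search(self, query: str, top_k: int = 20) -> list[tuple[int, float]]:
        # Return (passage index, score) pairs, best first, dropping zero scores.
        scored = [(i, self.score(query, i)) for i in range(len(self.docs))]
        scored = [pair for pair in scored if pair[1] > 0]
        return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_k]
```

Exact-token matching is a good fit for this corpus because part numbers and register names are rare, distinctive strings, which is where an IDF-weighted lexical score tends to do well.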
Use a cross-encoder reranker in PyTorch to rescore the top retrieved passages against the query and push the most relevant evidence into the prompt window.
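The shape of the rerank stage can be sketched without loading a model. In the snippet below, `rerank` and `PairScorer` are hypothetical names, and `overlap_scorer` is a toy stand-in so the example runs on its own; in the real pipeline the scorer would wrap the cross-encoder's forward pass over (query, passage) pairs.

```python
from typing import Callable, Sequence

# A scorer takes (query, passage) pairs and returns one relevance score per pair.
PairScorer = Callable[[Sequence[tuple[str, str]]], Sequence[float]]


def rerank(
    query: str,
    candidates: Sequence[str],
    scorer: PairScorer,
    top_n: int = 5,
) -> list[tuple[str, float]]:
    """Rescore each candidate jointly with the query and keep the top_n."""
    if not candidates:
        return []
    pairs = [(query, passage) for passage in candidates]
    scores = scorer(pairs)  # one batched call, which suits a model-backed scorer
    ranked = sorted(zip(candidates, scores), key=lambda item: item[1], reverse=True)
    return [(passage, float(score)) for passage, score in ranked[:top_n]]


def overlap_scorer(pairs: Sequence[tuple[str, str]]) -> list[float]:
    """Toy stand-in (not a cross-encoder): fraction of query words in the passage."""
    scores = []
    for query, passage in pairs:
        q_words = set(query.lower().split())
        p_words = set(passage.lower().split())
        scores.append(len(q_words & p_words) / max(len(q_words), 1))
    return scores
```

Keeping the scorer behind a plain callable means the expensive model only ever sees the small candidate set from the lexical stage, and it can be swapped or mocked in tests.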
Send only the best evidence into the answer stage, require citations, and measure whether the response stays inside the retrieved material instead of inventing details.
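One way to make the citation requirement checkable is to label each passage with its id in the prompt and then verify that the answer only cites ids it was given. This is a sketch under my own assumptions: the prompt wording, the bracketed-id convention, and the `build_prompt` and `check_citations` helpers are illustrative, and the check confirms that citations point at retrieved passages, not that each claim is actually supported by them.

```python
import re


def build_prompt(question: str, evidence: list[tuple[str, str]]) -> str:
    """Build a grounded prompt; evidence is a list of (chunk_id, passage_text)."""
    lines = [
        "Answer using only the passages below.",
        "Cite the passage id in square brackets after each claim.",
        "If the passages do not contain the answer, say so.",
        "",
    ]
    lines += [f"[{chunk_id}] {text}" for chunk_id, text in evidence]
    lines += ["", f"Question: {question}", "Answer:"]
    return "\n".join(lines)


def check_citations(answer: str, evidence: list[tuple[str, str]]) -> dict:
    """Report which cited ids exist in the evidence and which do not."""
    allowed = {chunk_id for chunk_id, _ in evidence}
    cited = set(re.findall(r"\[([^\[\]]+)\]", answer))
    return {
        "cited": sorted(cited & allowed),      # ids that match retrieved passages
        "unknown": sorted(cited - allowed),    # ids the model was never shown
        "has_citation": bool(cited & allowed),
    }
```

An answer with an empty `cited` list or a non-empty `unknown` list is a cheap signal that the response may have left the retrieved material.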
Evaluation goals
- Citation hit-rate on long technical questions
- Lower hallucination rate on part numbers and register names
- Latency below 2.5 s for top-k retrieval plus reranking
- Answer helpfulness measured against manually written reference responses
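Two of these goals can be turned into small, testable functions. The definitions below are assumptions rather than the project's actual metrics: citation hit-rate is taken as the fraction of questions whose answer cites at least one manually labeled gold passage, and the 2.5 s budget is checked against a high quantile of measured latencies instead of the mean.

```python
import time
from typing import Callable


def citation_hit_rate(cited: list[set[str]], gold: list[set[str]]) -> float:
    """Fraction of questions whose answer cites at least one gold passage id."""
    if not gold:
        return 0.0
    hits = sum(1 for c, g in zip(cited, gold) if c & g)
    return hits / len(gold)


def timed(fn: Callable[[], object]) -> tuple[object, float]:
    """Run fn once and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - start


def within_budget(
    latencies: list[float],
    budget_s: float = 2.5,
    quantile: float = 0.95,
) -> bool:
    """Check that the chosen quantile of latencies is below the budget."""
    if not latencies:
        return True
    ordered = sorted(latencies)
    index = min(len(ordered) - 1, int(quantile * len(ordered)))
    return ordered[index] < budget_s
```

In use, `timed` would wrap a single retrieval-plus-rerank call per question, and the collected latencies would be passed to `within_budget` at the end of an evaluation run.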