โ† Back to Projects

Portfolio Chatbot โ€” RAG Assistant

Portfolio RAG Chatbot
Python RAG LangChain LLMs FAISS Embeddings Groq Gradio NLP

Brief

A live, deployed AI assistant that answers questions about my background, projects, skills, and CV. It uses retrieval-augmented generation (RAG): rather than relying on a model's training memory, it retrieves the most relevant passages from my actual documents and feeds them to the LLM as context. Ask it anything โ€” it answers from source, not from guesswork.

What It Does

The chatbot is embedded directly on this portfolio and deployed as a public Hugging Face Space. Visitors can ask questions like "What is his latest job?", "What ML frameworks does he know?", or "Tell me about the IGARSS paper" โ€” and get accurate, grounded answers drawn from my CV and website content.

  • Answers questions about experience, projects, skills, and education
  • Grounds every answer in retrieved source documents โ€” no hallucination
  • Maintains short-term conversation memory across follow-up questions
  • Falls back gracefully to extractive synthesis when the API is unavailable

Data Sources

Three document collections were ingested and indexed at build time:

  • CV PDF โ€” full rรฉsumรฉ with experience, education, and skills
  • Portfolio website HTML โ€” all project pages from lpoly.github.io, scraped and parsed
  • FAQ text file โ€” manually written Q&A pairs covering common recruiter questions

Together these produce 108 chunks across 92 source documents, covering the full breadth of my professional profile.

Pipeline

RAG pipeline diagram

The pipeline has two phases. At index time, documents are chunked, embedded with BAAI/bge-m3, and stored in a FAISS vector index. At query time, the user's question is embedded with the same model, the closest chunks are retrieved by cosine similarity, re-ranked with a combination of semantic score and lexical overlap, and the top results are passed as context to the LLM.

Stack

  • Embedding model: BAAI/bge-m3 โ€” 570M-parameter multilingual model, robust to short and noisy queries
  • Vector store: FAISS (CPU) โ€” in-memory index, file-backed, sub-millisecond retrieval at this scale
  • Retrieval: LangChain FAISS wrapper ยท cosine similarity ยท multi-factor re-ranking (semantic + lexical + section bias)
  • LLM: Groq API ยท llama-3.3-70b-versatile โ€” fast, free inference; answers grounded in retrieved context only
  • Frontend: Gradio chat UI deployed to Hugging Face Spaces, with a custom JavaScript widget embedded on this site
  • Ingest: custom ingest.py script โ€” parses PDF (PyPDF), crawls HTML (BeautifulSoup), chunks and indexes with metadata