Semantic FAQ Engine | James Murray

James Murray has developed a semantic FAQ engine that synthesizes answers from over 2,400 FAQ embeddings to provide contextualized responses. Rather than matching keywords, the engine retrieves semantically similar FAQ entries by embedding similarity and uses an LLM to generate an answer grounded in the retrieved content.

In production, the system achieves 96% answer coverage and has reduced support ticket volume by 41%. Multi-hop retrieval chains related FAQs into comprehensive, cited responses with source links.
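The multi-hop chaining can be sketched as a greedy walk: retrieve the FAQ closest to the query, then pivot on that hit to find its nearest related entry. This is a minimal illustration with hand-made toy vectors and a hypothetical `FAQS` store, not the production retriever:

```python
import math

# Toy FAQ store: each entry carries a (pre-computed) embedding.
# Real embeddings are model outputs; these 3-d vectors are illustrative.
FAQS = [
    {"id": "faq-1", "text": "How do I reset my password?",          "vec": [0.9, 0.1, 0.0]},
    {"id": "faq-2", "text": "Password reset links expire in 1 hour.", "vec": [0.8, 0.2, 0.1]},
    {"id": "faq-3", "text": "How do I change my billing plan?",      "vec": [0.1, 0.9, 0.2]},
]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def multi_hop(query_vec, hops=2):
    """Greedily chain related FAQs: take the best match for the query,
    then the closest entry to the last hit that is not already chained."""
    chain, seen, anchor = [], set(), query_vec
    for _ in range(hops):
        best = max((f for f in FAQS if f["id"] not in seen),
                   key=lambda f: cosine(anchor, f["vec"]))
        chain.append(best)
        seen.add(best["id"])
        anchor = best["vec"]  # next hop pivots on the last hit
    return chain

chain = multi_hop([1.0, 0.0, 0.0])
print([f["id"] for f in chain])  # ['faq-1', 'faq-2']
```

The password-reset query lands on faq-1, and the second hop pulls in the related expiry FAQ rather than the unrelated billing entry.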

Ideal for customer support portals, internal wikis, and self-service knowledge bases.

Key Features

  • Contextualized Answers: Synthesizes from multiple FAQs with citation footnotes.
  • Scalable Embeddings: 2,400+ entries with sub-20ms retrieval.
  • AI-Driven Answer Generation: Uses Llama 3 8B for coherent synthesis.
  • Multi-Hop Reasoning: Chains 2-3 FAQs for complex queries.
  • Feedback Loop: User thumbs-up/down retrains reranker weekly.
  • Widget Embed: Drop-in JS widget for any CMS.
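The thumbs-up/down feedback loop can be illustrated with a simple linear reranker whose feature weights are nudged toward answers users approved of. The feature names and the perceptron-style update rule here are assumptions for illustration, not the production weekly training job:

```python
# Hypothetical linear reranker: score = sum of weight * feature value.
WEIGHTS = {"semantic_sim": 1.0, "freshness": 0.5, "popularity": 0.3}
LEARNING_RATE = 0.1

def score(features):
    return sum(WEIGHTS[k] * v for k, v in features.items())

def apply_feedback(features, thumbs_up):
    """Perceptron-style update: reinforce features of liked answers,
    penalise features of disliked ones."""
    sign = 1.0 if thumbs_up else -1.0
    for k, v in features.items():
        WEIGHTS[k] += LEARNING_RATE * sign * v

answer_feats = {"semantic_sim": 0.9, "freshness": 0.2, "popularity": 0.7}
before = score(answer_feats)
apply_feedback(answer_feats, thumbs_up=True)
after = score(answer_feats)
print(round(after - before, 4))  # positive: the liked answer now ranks higher
```

Batching these updates weekly, as the feature list describes, smooths out noisy individual votes before they affect ranking.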

System Design & Architecture

The engine uses vector embeddings to create a semantic map of answers. Queries are classified, retrieved, reranked, and synthesized in a 4-stage pipeline with observability at each step.
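The 4-stage pipeline can be sketched as the skeleton below, with per-stage timing standing in for the observability hooks. The stage bodies are stubs; in the real system they would call the embedding model, the vector store, and the LLM:

```python
import time

def classify(query):
    # Stub intent classifier.
    return "how_to" if query.lower().startswith("how") else "general"

def retrieve(query, intent):
    # Stub retriever returning (faq_id, similarity) pairs.
    return [("faq-2", 0.88), ("faq-1", 0.94)]

def rerank(candidates):
    return sorted(candidates, key=lambda c: c[1], reverse=True)

def synthesize(query, ranked):
    cites = ", ".join(fid for fid, _ in ranked)
    return f"Synthesized answer for {query!r} [sources: {cites}]"

def answer(query):
    """Run all four stages, recording wall-clock time per stage."""
    timings = {}
    t0 = time.perf_counter(); intent = classify(query)
    timings["classify"] = time.perf_counter() - t0
    t0 = time.perf_counter(); cands = retrieve(query, intent)
    timings["retrieve"] = time.perf_counter() - t0
    t0 = time.perf_counter(); ranked = rerank(cands)
    timings["rerank"] = time.perf_counter() - t0
    t0 = time.perf_counter(); text = synthesize(query, ranked)
    timings["synthesize"] = time.perf_counter() - t0
    return text, timings

text, timings = answer("How do I reset my password?")
print(text)
```

Keeping the stages as separate functions with a timing dict per request is what makes per-step observability cheap to add.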

Technical Stack

  • Embeddings: all-MiniLM-L6-v2
  • Vector DB: Chroma with IVF-PQ indexing
  • LLM: Llama 3 8B via vLLM
  • Frontend: PHP + Alpine.js widget
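The idea behind the IVF-PQ index used in Chroma can be shown with a toy inverted-file (IVF) search: vectors are bucketed by their nearest centroid, and a query scans only the closest bucket instead of the whole collection. Centroids and vectors here are hand-picked 2-d toys (real embeddings are 384-d MiniLM outputs), and the PQ compression step is omitted:

```python
import math

# Coarse quantizer: two centroids, each owning a bucket of FAQ vectors.
CENTROIDS = {"c0": (1.0, 0.0), "c1": (0.0, 1.0)}
BUCKETS = {
    "c0": [("faq-1", (0.9, 0.1)), ("faq-2", (0.7, 0.4))],
    "c1": [("faq-3", (0.2, 0.9))],
}

def ivf_search(query):
    # Coarse step: pick the bucket of the nearest centroid.
    bucket = min(CENTROIDS, key=lambda c: math.dist(query, CENTROIDS[c]))
    # Fine step: exact scan within that bucket only.
    return min(BUCKETS[bucket], key=lambda item: math.dist(query, item[1]))[0]

print(ivf_search((0.85, 0.2)))  # faq-1
```

Scanning one bucket instead of all 2,400+ entries is what makes the sub-20ms retrieval budget plausible at this scale.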

Related Projects

Explore other projects: