Drop in an interview recording, get a full candidate report in seconds. Produces multi-dimensional scoring across technical depth, communication clarity, and problem-solving — with radar charts, behavioral hesitation signals, topic sentiment, and a timestamped interview timeline. Built on a RAG pipeline with FastAPI and React.
The pipeline takes a raw interview audio file and transforms it into a structured JSON report through two sequential AI stages, orchestrated by a FastAPI backend.

1. **Upload:** The user uploads an <code>.mp3</code> file through the Next.js frontend, which sends it to the FastAPI backend via a REST call.
2. **Transcription (local GPU):** FastAPI passes the audio to OpenAI Whisper, running locally on a personal GPU. Whisper performs speech-to-text and writes a raw transcript to a <code>.txt</code> file.
3. **Summarization (Groq API):** The transcript is sent to LLaMA 3.1 8B via the Groq API, which extracts evaluation metrics: scores, hesitation signals, topic sentiments, timeline segments, and key takeaways.
4. **JSON response:** LLaMA returns a structured JSON object with all numeric scores and categorized data ready for rendering.
5. **UI render:** FastAPI sends the JSON back to Next.js, which renders the full dashboard: radar charts, progress bars, sentiment chips, and the interview timeline.
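The transcription step can also feed the timestamped timeline directly: Whisper's Python API returns a list of segments, each with `start`, `end`, and `text` fields. A minimal sketch of turning those segments into timeline entries, where `segments_to_timeline` and the `max_gap` pause threshold are hypothetical helpers (not part of the original project) and a long pause is treated as one possible hesitation cue:

```python
def format_timestamp(seconds: float) -> str:
    """Render seconds as MM:SS for the timeline UI."""
    m, s = divmod(int(seconds), 60)
    return f"{m:02d}:{s:02d}"

def segments_to_timeline(segments: list[dict], max_gap: float = 2.0) -> list[dict]:
    """Merge adjacent Whisper segments into timeline entries.

    A silence longer than `max_gap` seconds starts a new entry, so long
    pauses (a rough hesitation signal) become visible segment boundaries.
    """
    merged: list[dict] = []
    current: dict | None = None
    for seg in segments:
        if current is not None and seg["start"] - current["end_s"] <= max_gap:
            # Small gap: same stretch of speech, extend the current entry.
            current["text"] += " " + seg["text"].strip()
            current["end_s"] = seg["end"]
        else:
            # Long pause (or first segment): open a new timeline entry.
            current = {"start_s": seg["start"], "end_s": seg["end"],
                       "text": seg["text"].strip()}
            merged.append(current)
    return [{"start": format_timestamp(e["start_s"]),
             "end": format_timestamp(e["end_s"]),
             "text": e["text"]} for e in merged]
```

For example, two segments separated by 0.3 s collapse into one entry, while a 5 s pause before the next answer opens a new one.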
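Steps 3 and 4 hinge on the LLM returning well-formed JSON. A minimal sketch of that boundary, assuming Groq's OpenAI-compatible chat-completions API; the model id, prompt, and the `parse_report` key set are illustrative assumptions, not taken from the project's actual code:

```python
import json

# Assumed system prompt; the real project's prompt is not shown in this README.
SYSTEM_PROMPT = (
    "You are an interview evaluator. Given a transcript, return ONLY a JSON "
    "object with keys: scores, hesitation_signals, topic_sentiments, "
    "timeline, takeaways."
)

def build_groq_payload(transcript: str,
                       model: str = "llama-3.1-8b-instant") -> dict:
    """Build the chat-completions request body sent to the Groq API."""
    return {
        "model": model,  # assumed Groq model id for LLaMA 3.1 8B
        # Ask for a JSON object so the response parses without post-processing.
        "response_format": {"type": "json_object"},
        "messages": [
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": transcript},
        ],
    }

def parse_report(raw: str) -> dict:
    """Validate the model's JSON before FastAPI hands it to the frontend."""
    report = json.loads(raw)
    required = {"scores", "hesitation_signals", "topic_sentiments",
                "timeline", "takeaways"}
    missing = required - report.keys()
    if missing:
        raise ValueError(f"report is missing keys: {sorted(missing)}")
    return report
```

Validating the response server-side means a malformed LLM reply surfaces as a clear backend error instead of a broken dashboard render.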