Your textbook,
rendered into motion.
Mentrax ingests a PDF chapter, plans the lecture with Gemini, choreographs a Manim animation, narrates it with synthesized voice, and streams the finished video back over a live Redis channel — end to end, no human in the loop.
workflow_{user_id}: each finished clip lands over the WebSocket the moment Celery completes it.
Six stages, one Celery job
Nothing here is a mock. Every stage below maps to a real module in the backend — the chapter never waits on a human to animate, narrate, or publish it.
Chapter ingestion
A textbook PDF is uploaded with the index page range. PyMuPDF parses the table of contents into a topic → page map, scoped per course and per user.
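A minimal sketch of the topic → page step, assuming the text of the index pages has already been extracted (for example with PyMuPDF's `page.get_text()`). The `TOC_LINE` pattern and `build_topic_map` helper are hypothetical names for illustration, not the backend's code, and per-course and per-user scoping is left out.

```python
import re
from typing import Dict, Iterable

# Heuristic for lines such as "2.1 Newton's Laws ........ 34" or "Kinematics 12".
TOC_LINE = re.compile(
    r"^\s*(?P<num>\d+(?:\.\d+)*)?\.?\s*"  # optional section number, e.g. "2.1"
    r"(?P<title>.*?\S)"                   # topic title (shortest match)
    r"[\s.]+"                             # dot leaders or plain spaces
    r"(?P<page>\d+)\s*$"                  # page number at the end of the line
)


def build_topic_map(toc_lines: Iterable[str]) -> Dict[str, int]:
    """Map each topic title to its starting page; skip lines that do not match."""
    topic_pages: Dict[str, int] = {}
    for line in toc_lines:
        match = TOC_LINE.match(line)
        if match is None:
            continue  # headings like "Contents" or blank lines
        topic_pages[match.group("title")] = int(match.group("page"))
    return topic_pages
```

A regex like this is only a heuristic: a line such as "Chapter 1" would be read as a topic on page 1, so a real parser would likely need stricter rules for the book's layout.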
POST /upload_book
Lecture planning
Gemini reads the chapter, the requested mode (deep dive or revise), and the learner's stated level, then drafts a topic-by-topic teaching plan with explicit pacing.
animation_plan.py
Scene choreography
Each plan step compiles to Manim scene code — shapes, transforms, and on-screen text are generated programmatically, not hand-animated.
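One way a plan step could compile to Manim scene code is plain template rendering. The sketch below is an assumption about the approach, not the contents of `manim_run.py`: `PlanStep`, `SCENE_TEMPLATE`, and `render_scene_source` are hypothetical names, and the generated scene only writes a title, holds, and fades out.

```python
import keyword
from dataclasses import dataclass


@dataclass
class PlanStep:
    """Hypothetical shape of one step in the teaching plan."""
    class_name: str      # name of the generated Scene subclass
    title: str           # on-screen text for this step
    hold_seconds: float  # how long the text stays on screen


SCENE_TEMPLATE = '''from manim import Scene, Text, Write, FadeOut


class {class_name}(Scene):
    def construct(self):
        title = Text({title!r})
        self.play(Write(title))
        self.wait({hold_seconds})
        self.play(FadeOut(title))
'''


def render_scene_source(step: PlanStep) -> str:
    """Return Python source for a Manim scene that shows the step's title."""
    name = step.class_name
    if not name.isidentifier() or keyword.iskeyword(name):
        raise ValueError(f"not a valid class name: {name!r}")
    # repr() on the title keeps quotes and backslashes safe inside the source.
    return SCENE_TEMPLATE.format(
        class_name=name,
        title=step.title,
        hold_seconds=float(step.hold_seconds),
    )
```

The returned string can be written to a `.py` file and rendered with the Manim CLI; generating source rather than calling Manim in-process keeps the planner testable without a renderer installed.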
manim_run.py
Narration synthesis
gTTS renders the matching voiceover per scene; audio chunks are timed against the animation so narration and motion land in sync.
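The sync step can be pictured as simple duration arithmetic. This is a sketch under assumptions, not the logic of `text_to_audio_chunks()`: it supposes each scene's animation and audio durations are already known, and `SceneTiming`, `sync_padding`, and `timeline` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Tuple


@dataclass
class SceneTiming:
    scene_id: str
    animation_s: float  # length of the rendered animation, in seconds
    audio_s: float      # length of the synthesized narration, in seconds


def sync_padding(t: SceneTiming) -> Tuple[float, float]:
    """Return (extra_hold_s, trailing_silence_s) so both tracks end together.

    If the narration runs longer, the last frame is held; if the animation
    runs longer, silence is appended to the audio.
    """
    gap = t.audio_s - t.animation_s
    return (max(gap, 0.0), max(-gap, 0.0))


def timeline(timings: Iterable[SceneTiming]) -> Tuple[Dict[str, float], float]:
    """Start offset of each padded scene, plus the total running time."""
    offsets: Dict[str, float] = {}
    cursor = 0.0
    for t in timings:
        offsets[t.scene_id] = cursor
        cursor += max(t.animation_s, t.audio_s)  # padded scene length
    return offsets, cursor
```

For example, a 4.0 s animation with 6.5 s of narration needs a 2.5 s hold on the final frame, and the next scene starts at the 6.5 s mark.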
text_to_audio_chunks()
Async render queue
A Celery worker (pcc.delay) runs the whole chapter job off the request thread, so uploading a 40-page chapter never blocks the API.
celery + redis broker
Live delivery
As each clip finishes, it's appended to a per-user Redis stream and relayed to the browser over a raw WebSocket the instant it's ready.
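A rough sketch of the stream side of that hand-off, assuming a redis-py style client created with `decode_responses=True` and passed in by the caller. The `workflow_{user_id}` key follows the label shown on this page; the field names (`topic`, `clip_url`) and the helper functions are hypothetical.

```python
import json
from typing import Any, List, Tuple


def stream_key(user_id: str) -> str:
    """Per-user stream name, following the workflow_{user_id} pattern."""
    return f"workflow_{user_id}"


def publish_clip(redis_client: Any, user_id: str, topic: str, clip_url: str) -> Any:
    """Worker side: append one finished clip to the user's stream."""
    fields = {"topic": topic, "clip_url": clip_url}  # assumed payload shape
    return redis_client.xadd(stream_key(user_id), fields)


def read_new_clips(
    redis_client: Any, user_id: str, last_id: str = "0-0"
) -> List[Tuple[str, str]]:
    """WebSocket side: fetch entries newer than last_id as JSON text frames."""
    reply = redis_client.xread({stream_key(user_id): last_id}, count=10)
    frames: List[Tuple[str, str]] = []
    for _stream_name, entries in reply:
        for entry_id, fields in entries:
            # Each frame is ready to hand to a WebSocket send call.
            frames.append((entry_id, json.dumps(fields, sort_keys=True)))
    return frames
```

A WebSocket handler would call `read_new_clips` in a loop, send each frame, and remember the last entry ID so a reconnecting browser resumes where it left off rather than replaying the whole stream.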
redis.xadd → /ws/
Talk to the real backend
This panel calls the deployed FastAPI service directly from your browser — same endpoints a production client would use. Connect a backend via NEXT_PUBLIC_API_URL to take it out of demo mode.
// create a course, then upload a chapter PDF
// to extract its table of contents.
POST /create_course
POST /upload_book
POST /start_teaching
An interviewer that actually listens.
Mentrax runs full mock interviews over voice: it speaks with synthesized audio, transcribes your spoken answer with the browser's speech recognition, waits out a 5-second silence to know you're done, and adapts the next question to your resume and the role.
Built like production, not a notebook
- FastAPI
- Pydantic v2
- WebSockets
- python-multipart
- Celery
- Redis (broker + streams)
- asyncio
- Gemini 2.0 Flash
- Manim
- gTTS
- moviepy
- PyMuPDF
- PyPDF2
- FAISS
- Docker
- Gunicorn + Uvicorn
- Cloudinary
- boto3 / S3
- Next.js 16
- TypeScript
- Tailwind v4
- Framer Motion