<project_overview> I am building a modern, highly scalable educational video conferencing platform (similar to a specialized Zoom/Google Meet for universities). Key differentiators:
- Deep integration of an AI Assistant that automatically joins rooms, transcribes audio in real-time, and generates post-lecture summaries.
- Advanced security and moderation tools to prevent "Zoom-bombing".
- A centralized post-lecture dashboard containing chat history, shared files, and AI summaries. </project_overview>
<tech_stack>
- Frontend: Next.js 16 (App Router), React 19, TypeScript, Tailwind CSS v4.
- Video Core: LiveKit Cloud (initial phase; migration to self-hosted LiveKit on Proxmox planned for later). Client UI via `@livekit/components-react`.
- Backend: Node.js (Next.js Route Handlers). LiveKit Server Node SDK (`livekit-server-sdk`) for token generation and room management.
- Authentication: `better-auth` with Prisma adapter. Email/Password for Hosts and Admins. Guests join without auth (name-only entry), with a `user_id` column reserved for future LMS account linking.
- AI Module: Python — LiveKit Agents framework (`livekit-agents`). Real-time STT via `livekit-plugins-deepgram` (streaming, low latency). Post-lecture summarization via the OpenAI GPT API (`livekit-plugins-openai`).
- Database: PostgreSQL with Prisma 7 ORM. Config lives in `prisma.config.ts` (`migrate.url()` for the CLI). Runtime connection goes through the `@prisma/adapter-pg` driver adapter — no `url` in the schema datasource (Prisma 7 forbids it).
- Storage: S3-compatible storage (MinIO self-hosted on Proxmox, or AWS S3) for recordings, shared files, and chat attachments.
- Reverse Proxy: Traefik v3 with automatic Let's Encrypt SSL (production). Local dev runs on direct ports without a proxy.
- PDF Export: `@react-pdf/renderer` for generating downloadable lecture summaries.
- Google Drive Export: Google Drive API with OAuth for optional export of lecture artifacts. </tech_stack>
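The Prisma 7 datasource rule above (no `url` in the schema, driver adapter at runtime) can be sketched as a client module. This is a minimal sketch, not the project's actual `lib/prisma.ts`; it assumes `DATABASE_URL` is set and that the installed `@prisma/adapter-pg` version accepts a pg `PoolConfig` directly (newer adapter releases do; older ones take a `pg.Pool` instance instead).

```typescript
// lib/prisma.ts (sketch) — Prisma 7 with the pg driver adapter.
// The schema datasource declares no `url` (Prisma 7 rejects it),
// so the runtime connection is configured entirely here.
import { PrismaPg } from "@prisma/adapter-pg";
import { PrismaClient } from "@prisma/client";

const adapter = new PrismaPg({
  connectionString: process.env.DATABASE_URL, // assumed env var name
});

export const prisma = new PrismaClient({ adapter });
```

The CLI side (migrations) reads its connection from `prisma.config.ts` instead, which is why the two paths can point at different endpoints (e.g. pgbouncer for runtime, direct Postgres for migrate).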
<user_roles>
- Super Admin: Access to a global dashboard. Can monitor all active rooms, manage global AI settings, view system logs, and manage all users.
- Host (Professor/Teacher): Authenticated user (email/password via `better-auth`). Can create rooms, manage participants, share files, and control room security settings.
- Guest/Student: Joins via invite link. Does not require registration (enters a display name only). The database stores a nullable `user_id` to support future LMS integration, where guests can optionally link or create accounts. </user_roles>
<security_and_moderation> Rooms must support the following security features configured by the Host:
- Waiting Room (Lobby): Guests cannot enter the LiveKit room until the Host explicitly approves them. Implementation:
  - Guest submits a join request to the backend API, which creates a `LobbyEntry` record with status `PENDING`.
  - Guest connects to an SSE (Server-Sent Events) endpoint (`/api/rooms/[roomId]/lobby/stream`) and waits for a status change. The SSE handler validates `sessionId` against an existing `LobbyEntry` before starting the stream.
  - Host sees pending guests in the room UI (fetched via polling or SSE) and approves or rejects via an API call.
  - On approval, the backend generates a LiveKit token and pushes `APPROVED` status (with the token) to the guest's SSE stream. The guest then connects to LiveKit.
  - On rejection, the backend pushes `REJECTED` status and the guest sees a denial message.
- PIN Codes: Rooms can optionally be protected by a PIN. The PIN is hashed (bcrypt) and stored in `Room.pinHash`. Guests must submit the correct PIN before entering the lobby. Rate limited: max 5 PIN attempts per IP per minute (in-memory limiter, returns 429).
- Moderation Actions:
  - Kick & Ban: Host can remove a participant and add them to the `BannedEntry` table. Bans key on `sessionFingerprint` (required; a combination of browser-fingerprint hash and session ID) as the primary identifier. The IP address is stored as a secondary signal but is never the sole basis for a ban (easily bypassed via VPN). A secondary ban check uses `sessionId`.
  - Panic Button (Mute All): Host can instantly revoke `canPublish` permissions for all participants except themselves via a single API call using the LiveKit Server SDK's `updateParticipant`.
- Hand-Raising (Webinar Mode): In rooms with `webinarMode: true`, guests join with `canPublish: false`. They send a "raise hand" signal via LiveKit DataChannels. The Host sees raised hands in the UI and can dynamically grant `canPublish: true` to selected participants via the backend.
- Chat & Files auth: Both endpoints verify the requester is either an authenticated Host/Admin or a participant with a valid `sessionId` in `ParticipantHistory` for the room.
- LiveKit token security: The token-generation endpoint verifies that the requesting user is the actual host of the room (checks `room.hostId === session.user.id`). </security_and_moderation>
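The PIN rate limit above (max 5 attempts per IP per minute, in-memory, 429 on denial) can be sketched as a small sliding-window helper. The function name and module layout are hypothetical; the route handler would call this before comparing against `Room.pinHash` and respond with HTTP 429 when it returns `false`.

```typescript
// In-memory sliding-window limiter: max 5 PIN attempts per IP per minute.
// Hypothetical helper — not the project's actual limiter module.
const WINDOW_MS = 60_000;
const MAX_ATTEMPTS = 5;

// Per-IP timestamps of recent attempts. In-memory by design (see spec),
// so limits reset on server restart and are per-instance.
const attempts = new Map<string, number[]>();

export function allowPinAttempt(ip: string, now: number = Date.now()): boolean {
  // Keep only timestamps inside the current one-minute window.
  const recent = (attempts.get(ip) ?? []).filter((t) => now - t < WINDOW_MS);
  if (recent.length >= MAX_ATTEMPTS) {
    attempts.set(ip, recent);
    return false; // caller responds with HTTP 429
  }
  recent.push(now);
  attempts.set(ip, recent);
  return true;
}
```

Because the window slides per attempt rather than resetting on a fixed minute boundary, a burst of 5 wrong PINs blocks further tries until the oldest attempt ages out.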
<ai_assistant> The AI Assistant is a Python-based LiveKit Agent that:
- Auto-joins every room on the `room_started` webhook (or via LiveKit Agents' `AutoSubscribe`).
- Real-time transcription: Subscribes to all audio tracks. Uses Deepgram streaming STT (`livekit-plugins-deepgram`) for low-latency, real-time transcription. Transcription segments are published back to the room via LiveKit DataChannels (topic: `transcription`) so participants can see live captions.
- Post-lecture summarization: When the room ends (`room_finished` webhook), the agent takes the full transcript and sends it to the OpenAI GPT API for structured summarization (key topics, action items, Q&A highlights).
- Storage: The transcript and summary are saved to the `LectureArtifact` table in PostgreSQL via psycopg2. The raw transcript is also stored as a file in S3. </ai_assistant>
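On the web client, the caption packets the agent publishes on the `transcription` data topic have to be decoded and validated before rendering. A minimal sketch of that decode step; the `{ speaker, text, final }` wire format is an assumption (the Python agent and the client only need to agree on one JSON shape), and the function name is hypothetical.

```typescript
// Decode one live-caption packet received on the "transcription" data topic.
// The segment shape below is an assumed wire format, not a LiveKit type.
type TranscriptionSegment = { speaker: string; text: string; final: boolean };

export function parseTranscriptionPayload(
  payload: Uint8Array,
): TranscriptionSegment | null {
  try {
    const parsed = JSON.parse(new TextDecoder().decode(payload));
    if (typeof parsed.speaker !== "string" || typeof parsed.text !== "string") {
      return null;
    }
    return {
      speaker: parsed.speaker,
      text: parsed.text,
      final: Boolean(parsed.final), // interim Deepgram results arrive with final=false
    };
  } catch {
    return null; // drop malformed packets instead of crashing the captions UI
  }
}
```

In the room UI this would be wired to the `RoomEvent.DataReceived` handler from `livekit-client`, filtering on `topic === "transcription"` before calling the parser.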
<post_lecture_processing>
When the Host ends the call (triggering LiveKit room_finished webhook):
- The backend sets the room status to ENDED and destroys the LiveKit room via `roomService.deleteRoom()`.
- The AI Agent finalizes the transcript and generates a structured summary via the LLM.
- The backend aggregates: chat history (from the `ChatMessage` table), links to all S3 files shared during the call, and the AI summary.
- A "Lecture Artifacts" page (`/lectures/[id]`) is generated, displaying:
  - AI summary (structured: key topics, action items, Q&A)
  - Full transcript (searchable)
  - Chat history
  - Shared files (downloadable)
  - Export options: Download as PDF (`@react-pdf/renderer`), Export to Google Drive (OAuth).
- Access to this page is restricted to the Host, Admin, and optionally participants who were in the room. </post_lecture_processing>
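The access rule in the last bullet can be expressed as a pure predicate the `/lectures/[id]` route would evaluate. A sketch under stated assumptions: the `participantsCanView` flag (modeling the "optionally participants" part) and the exact field names are hypothetical, mirroring but not quoting the schema described in this document.

```typescript
// Who may open the Lecture Artifacts page: Host and Admin always,
// past participants only when the room opts in. Field names are assumptions.
type Viewer =
  | { kind: "user"; id: string; role: "ADMIN" | "HOST" }
  | { kind: "guest"; sessionId: string };

export function canViewLectureArtifacts(
  viewer: Viewer,
  room: { hostId: string; participantsCanView: boolean },
  participantSessionIds: Set<string>, // sessionIds from ParticipantHistory
): boolean {
  if (viewer.kind === "user") {
    return viewer.role === "ADMIN" || viewer.id === room.hostId;
  }
  // Guests qualify only if the room opts in and they actually attended.
  return room.participantsCanView && participantSessionIds.has(viewer.sessionId);
}
```

Keeping this as a standalone function means the same rule can guard both the page loader and any export endpoints (PDF, Drive) without duplication.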
<current_state> The following is already implemented:
- Prisma schema: 11 models (User, Session, Account, Verification, Room, LobbyEntry, ParticipantHistory, ChatMessage, SharedFile, BannedEntry, LectureArtifact), 3 enums
- 15 API route handlers (auth, rooms CRUD, join, lobby, SSE, moderation, chat, files, token, webhook, by-code lookup)
- 7 frontend pages (landing, login, register, dashboard, create room, join, video room)
- 4 components (ChatPanel, ModerationPanel, WaitingRoom, LobbyManager)
- Core libs (prisma, auth, auth-helpers, livekit)
- Dev protection middleware (DEV_ACCESS_KEY, ALLOWED_IPS)
- Docker Compose (local + prod profiles)
- Python AI Agent with Deepgram STT + OpenAI summarization
- Multi-stage Dockerfile for Next.js </current_state>
<next_steps> Remaining work to reach full MVP:
- Lecture Artifacts page (`/lectures/[id]`) — display transcript, summary, chat, and files with export
- Admin dashboard — global room monitoring, user management, role promotion
- File upload flow — presigned S3 URL generation, actual upload component
- Google Drive export — OAuth flow, Drive API integration
- PDF export — `@react-pdf/renderer` for lecture summaries
- Room invite code — generate shorter user-friendly codes instead of cuid
- UI polish — responsive design, loading states, error boundaries
- Tests — API route integration tests, component tests </next_steps>
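The "Room invite code" item above could be implemented as a small generator. A sketch, not a committed design: the alphabet (digits and uppercase letters minus the ambiguous 0/O, 1/I/L) and the 8-character default are assumptions; 8 characters over a 31-symbol alphabet gives roughly 40 bits, which is plenty once the backend also checks uniqueness against the Room table before saving.

```typescript
// Short, user-friendly room invite codes to replace raw cuids in invite links.
// Uses crypto-quality randomness and skips look-alike characters (0/O, 1/I/L).
import { randomInt } from "node:crypto";

const ALPHABET = "23456789ABCDEFGHJKMNPQRSTUVWXYZ";

export function generateInviteCode(length = 8): string {
  let code = "";
  for (let i = 0; i < length; i++) {
    code += ALPHABET[randomInt(ALPHABET.length)];
  }
  return code;
}
```

On collision (unique-constraint violation when saving the Room), the caller would simply generate a fresh code and retry.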