LiveServer-M1/PROMPT.md
joylessorchid 68b6eaeac5 docs: update CLAUDE.md, README.md, PROMPT.md, setup.sh per post-change checklist
- Prisma 7 notes: adapter-pg, no url in schema
- README: setup.sh commands, first admin auto-assign, pgbouncer in docker
- PROMPT.md: Prisma 7 adapter-pg description
- setup.sh dev: reordered (kill first, containers before prisma, healthchecks)
2026-03-24 06:54:39 +03:00

You are an expert Senior Full-Stack Engineer specializing in WebRTC (LiveKit), Next.js, AI integrations, and secure backend architectures.

<project_overview> I am building a modern, highly scalable educational video conferencing platform (similar to a specialized Zoom/Google Meet for universities). Key differentiators:

  1. Deep integration of an AI Assistant that automatically joins rooms, transcribes audio in real-time, and generates post-lecture summaries.
  2. Advanced security and moderation tools to prevent "Zoom-bombing".
  3. A centralized post-lecture dashboard containing chat history, shared files, and AI summaries. </project_overview>

<tech_stack>

  • Frontend: Next.js 16 (App Router), React 19, TypeScript, Tailwind CSS v4.
  • Video Core: LiveKit Cloud (initial phase; migration to self-hosted LiveKit on Proxmox planned for later). Client UI via @livekit/components-react.
  • Backend: Node.js (Next.js Route Handlers). LiveKit Server Node SDK (livekit-server-sdk) for token generation and room management.
  • Authentication: better-auth with Prisma adapter. Email/Password for Hosts and Admins. Guests join without auth (name-only entry), with a user_id column reserved for future LMS account linking.
  • AI Module: Python — LiveKit Agents framework (livekit-agents). Real-time STT via livekit-plugins-deepgram (streaming, low latency). Post-lecture summarization via OpenAI GPT API (livekit-plugins-openai).
  • Database: PostgreSQL with Prisma 7 ORM. Config in prisma.config.ts (migrate.url() for CLI). Runtime connection via @prisma/adapter-pg driver adapter — no url in schema datasource (Prisma 7 forbids it).
  • Storage: S3-compatible storage (MinIO self-hosted on Proxmox, or AWS S3) for recordings, shared files, and chat attachments.
  • Reverse Proxy: Traefik v3 with automatic Let's Encrypt SSL (production). Local dev runs on direct ports without proxy.
  • PDF Export: @react-pdf/renderer for generating downloadable lecture summaries.
  • Google Drive Export: Google Drive API with OAuth for optional export of lecture artifacts. </tech_stack>

<user_roles>

  1. Super Admin: Access to a global dashboard. Can monitor all active rooms, manage global AI settings, view system logs, and manage all users.
  2. Host (Professor/Teacher): Authenticated user (email/password via better-auth). Can create rooms, manage participants, share files, and control room security settings.
  3. Guest/Student: Joins via invite link. Does not require registration (enters a display name only). The database stores a user_id (nullable) to support future LMS integration where guests can optionally link or create accounts. </user_roles>
- **Library:** `better-auth` with Prisma adapter for database-backed sessions.
- **Host/Admin flow:** Standard email/password registration and login. Sessions stored in PostgreSQL.
- **Guest flow:** No registration required. Guest provides a display name and a `sessionFingerprint` (browser fingerprint hash, required). Receives a short-lived LiveKit token scoped to a specific room. Guest identity is tracked via a generated `sessionId` (UUID) stored in `ParticipantHistory`.
- **Authorization:** Middleware checks role (`ADMIN`, `HOST`, `GUEST`) before granting access to protected API routes and pages.
- **Local dev protection:** When `DOMAIN` env is not set, a `DEV_ACCESS_KEY` middleware blocks access without `?key=...` query param or valid cookie. Optional `ALLOWED_IPS` whitelist.
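
The local dev protection rule above can be illustrated with a small pure decision function. This is only a sketch under stated assumptions: the names `DevGateInput` and `checkDevAccess` are hypothetical, `ALLOWED_IPS` is treated as an extra restriction that is checked before the key, and a missing `DEV_ACCESS_KEY` is assumed to fail closed. The real middleware may differ on all three points.

```typescript
// Hypothetical inputs, extracted from the request and environment by the caller.
interface DevGateInput {
  domain?: string;       // process.env.DOMAIN
  devAccessKey?: string; // process.env.DEV_ACCESS_KEY
  allowedIps?: string[]; // parsed from process.env.ALLOWED_IPS (format is an assumption)
  queryKey?: string;     // value of the ?key= query param, if present
  cookieKey?: string;    // value of the access cookie (cookie name is an assumption)
  clientIp?: string;
}

type DevGateResult = "allow" | "allow-and-set-cookie" | "deny";

function checkDevAccess(input: DevGateInput): DevGateResult {
  // DOMAIN set means production: the dev gate is inactive.
  if (input.domain) return "allow";

  // Assumption: when an allowlist is configured, unknown IPs are rejected outright.
  if (input.allowedIps && input.allowedIps.length > 0) {
    if (!input.clientIp || !input.allowedIps.includes(input.clientIp)) {
      return "deny";
    }
  }

  // Assumption: without a configured key the gate fails closed.
  if (!input.devAccessKey) return "deny";

  // A valid cookie from an earlier visit is enough.
  if (input.cookieKey === input.devAccessKey) return "allow";

  // A valid query key grants access and lets the caller persist it as a cookie.
  if (input.queryKey === input.devAccessKey) return "allow-and-set-cookie";

  return "deny";
}
```

Keeping the decision separate from the Next.js middleware wrapper makes it easy to unit test without constructing request objects.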

<security_and_moderation> Rooms must support the following security features configured by the Host:

  1. Waiting Room (Lobby): Guests cannot enter the LiveKit room until the Host explicitly approves them. Implementation:

    • Guest submits a join request to the backend API, which creates a LobbyEntry record with status PENDING.
    • Guest connects to an SSE (Server-Sent Events) endpoint (/api/rooms/[roomId]/lobby/stream) and waits for status change. SSE validates sessionId against existing LobbyEntry before starting stream.
    • Host sees pending guests in the room UI (fetched via polling or SSE). Host approves or rejects via API call.
    • On approval, backend generates a LiveKit token and pushes APPROVED status (with token) to the guest's SSE stream. Guest then connects to LiveKit.
    • On rejection, backend pushes REJECTED status. Guest sees a denial message.
  2. PIN Codes: Rooms can optionally be protected by a PIN. The PIN is hashed (bcrypt) and stored in Room.pinHash. Guests must submit the correct PIN before entering the lobby. Rate limited: max 5 PIN attempts per IP per minute (in-memory limiter, returns 429).

  3. Moderation Actions:

    • Kick & Ban: Host can remove a participant and add them to the BannedEntry table. Ban is based on sessionFingerprint (required, combination of browser fingerprint hash + session ID) as the primary identifier. IP address is stored as a secondary signal but is not the sole basis for bans (easily bypassed via VPN). Secondary ban check by sessionId.
    • Panic Button (Mute All): Host can instantly revoke canPublish permissions for all participants except themselves via a single API call using LiveKit Server SDK's updateParticipant.
  4. Hand-Raising (Webinar Mode): In rooms with webinarMode: true, guests join with canPublish: false. They send a "raise hand" signal via LiveKit DataChannels. The Host sees raised hands in the UI and can dynamically grant canPublish: true to selected participants via the backend.

  5. Chat & Files auth: Both endpoints verify the requester is either an authenticated Host/Admin or a participant with a valid sessionId in ParticipantHistory for the room.

  6. LiveKit token security: Token generation endpoint verifies the requesting user is the actual host of the room (checks room.hostId === session.user.id). </security_and_moderation>
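
The PIN rate limit described above (max 5 attempts per IP per minute, in-memory, 429 on excess) could be sketched as a sliding-window limiter. The class name `PinRateLimiter` and the sliding-window strategy are assumptions for illustration. The source only specifies the limit itself, so a fixed-window counter would satisfy it equally well.

```typescript
// Minimal in-memory sliding-window limiter (hypothetical helper, not the project's actual code).
class PinRateLimiter {
  private readonly maxAttempts: number;
  private readonly windowMs: number;
  // Timestamps (ms) of recent attempts, keyed by client IP.
  private readonly attempts = new Map<string, number[]>();

  constructor(maxAttempts: number = 5, windowMs: number = 60000) {
    this.maxAttempts = maxAttempts;
    this.windowMs = windowMs;
  }

  // Returns true if the attempt may proceed, false if the route should answer 429.
  // `now` is injectable so the behaviour can be tested without real timers.
  tryAttempt(ip: string, now: number = Date.now()): boolean {
    const cutoff = now - this.windowMs;
    // Drop attempts that have aged out of the window.
    const recent = (this.attempts.get(ip) ?? []).filter((t) => t > cutoff);
    if (recent.length >= this.maxAttempts) {
      this.attempts.set(ip, recent);
      return false;
    }
    recent.push(now);
    this.attempts.set(ip, recent);
    return true;
  }
}
```

In a route handler, the check would run before the bcrypt comparison so that blocked clients never trigger a hash computation. Being in-memory, the state is per process and resets on restart.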

<ai_assistant> The AI Assistant is a Python-based LiveKit Agent that:

  1. Auto-joins every room on room_started webhook (or via LiveKit Agents' automatic dispatch).
  2. Real-time transcription: Subscribes to all audio tracks. Uses Deepgram streaming STT (livekit-plugins-deepgram) for low-latency, real-time transcription. Transcription segments are published back to the room via LiveKit DataChannels (topic: transcription) so participants can see live captions.
  3. Post-lecture summarization: When the room ends (room_finished webhook), the agent takes the full transcript and sends it to OpenAI GPT API for structured summarization (key topics, action items, Q&A highlights).
  4. Storage: Transcript and summary are saved to the LectureArtifact table in PostgreSQL via psycopg2. Raw transcript is also stored as a file in S3. </ai_assistant>
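
On the client side, the live captions fed by the `transcription` topic could be maintained with a small reducer like the one below. The payload shape (`id`, `speaker`, `text`, `final`) is an assumption made for this sketch. The agent's actual message format is not specified here, so `CaptionSegment`, `parseSegment`, and `applySegment` are all hypothetical names.

```typescript
// Assumed JSON payload for one transcription message (not confirmed by the agent code).
interface CaptionSegment {
  id: string;      // stable across interim and final versions of the same utterance
  speaker: string; // participant identity
  text: string;
  final: boolean;  // false while the STT result is still interim
}

// Parses a decoded data-channel payload; returns null for anything malformed.
function parseSegment(json: string): CaptionSegment | null {
  try {
    const v = JSON.parse(json);
    if (
      v &&
      typeof v.id === "string" &&
      typeof v.speaker === "string" &&
      typeof v.text === "string" &&
      typeof v.final === "boolean"
    ) {
      return { id: v.id, speaker: v.speaker, text: v.text, final: v.final };
    }
  } catch {
    // Malformed JSON: fall through and ignore the message.
  }
  return null;
}

// Replaces an existing segment with the same id (interim -> final), otherwise appends.
// Only the most recent `maxLines` segments are kept on screen.
function applySegment(
  captions: CaptionSegment[],
  incoming: CaptionSegment,
  maxLines: number = 3
): CaptionSegment[] {
  const idx = captions.findIndex((c) => c.id === incoming.id);
  const next =
    idx >= 0
      ? captions.map((c, i) => (i === idx ? incoming : c))
      : [...captions, incoming];
  return next.slice(-maxLines);
}
```

The reducer returns a new array on every call, so it can be used directly as a React state updater.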

<post_lecture_processing> When the Host ends the call (triggering LiveKit room_finished webhook):

  1. The backend sets room status to ENDED and destroys the LiveKit room via roomService.deleteRoom().
  2. The AI Agent finalizes the transcript and generates a structured summary via LLM.
  3. The backend aggregates: chat history (from ChatMessage table), links to all S3 files shared during the call, and the AI summary.
  4. A "Lecture Artifacts" page (/lectures/[id]) is generated, displaying:
    • AI summary (structured: key topics, action items, Q&A)
    • Full transcript (searchable)
    • Chat history
    • Shared files (downloadable)
    • Export options: Download as PDF (@react-pdf/renderer), Export to Google Drive (OAuth).
  5. Access to this page is restricted to the Host, Admin, and optionally participants who were in the room. </post_lecture_processing>
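
The access rule for the Lecture Artifacts page can be expressed as a pure predicate. This sketch assumes that "the Host" means the host who owns the room, and that participant access is a per-lecture opt-in flag. The names `Viewer`, `LectureAccessContext`, `allowParticipants`, and `canViewLecture` are hypothetical.

```typescript
type Role = "ADMIN" | "HOST" | "GUEST";

// Who is asking. Guests have no userId, only the sessionId issued at join time.
interface Viewer {
  role: Role;
  userId?: string;
  sessionId?: string;
}

// What the page needs to know about the lecture (hypothetical shape).
interface LectureAccessContext {
  hostId: string;                  // Room.hostId
  participantSessionIds: string[]; // sessionIds from ParticipantHistory for this room
  allowParticipants: boolean;      // assumed opt-in flag for participant access
}

function canViewLecture(viewer: Viewer, ctx: LectureAccessContext): boolean {
  // Admins can see every lecture.
  if (viewer.role === "ADMIN") return true;

  // The host who owns the room can always see it.
  if (viewer.role === "HOST" && viewer.userId === ctx.hostId) return true;

  // Participants only when enabled, and only if they actually attended.
  if (
    ctx.allowParticipants &&
    viewer.sessionId !== undefined &&
    ctx.participantSessionIds.includes(viewer.sessionId)
  ) {
    return true;
  }

  return false;
}
```

A server component for `/lectures/[id]` could call this after loading the room and its `ParticipantHistory` rows, and return a 404 or 403 when it yields `false`.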
- **Local development:** `docker compose up -d` auto-applies `docker-compose.override.yml` with direct ports (3000, 5432, 9000). No SSL, protected by `DEV_ACCESS_KEY` middleware.
- **Production:** `docker compose -f docker-compose.yml -f docker-compose.prod.yml up -d` runs the Traefik v3 reverse proxy with automatic Let's Encrypt SSL. Requires `DOMAIN` and `ACME_EMAIL` env vars. DNS must point `DOMAIN`, `s3.DOMAIN`, `minio.DOMAIN` to server IP.
- **Next.js build:** Uses `output: "standalone"` and multi-stage Dockerfile. On Windows, use `--webpack` flag (Turbopack has WASM limitation).
- **Future:** Self-hosted LiveKit Server on Proxmox VM with TURN/STUN via Coturn.

<current_state> The following is already implemented:

  • Prisma schema: 11 models (User, Session, Account, Verification, Room, LobbyEntry, ParticipantHistory, ChatMessage, SharedFile, BannedEntry, LectureArtifact), 3 enums
  • 15 API route handlers (auth, rooms CRUD, join, lobby, SSE, moderation, chat, files, token, webhook, by-code lookup)
  • 7 frontend pages (landing, login, register, dashboard, create room, join, video room)
  • 4 components (ChatPanel, ModerationPanel, WaitingRoom, LobbyManager)
  • Core libs (prisma, auth, auth-helpers, livekit)
  • Dev protection middleware (DEV_ACCESS_KEY, ALLOWED_IPS)
  • Docker Compose (local + prod profiles)
  • Python AI Agent with Deepgram STT + OpenAI summarization
  • Multi-stage Dockerfile for Next.js </current_state>

<next_steps> Remaining work to reach full MVP:

  1. Lecture Artifacts page (/lectures/[id]) — display transcript, summary, chat, files with export
  2. Admin dashboard — global room monitoring, user management, role promotion
  3. File upload flow — presigned S3 URL generation, actual upload component
  4. Google Drive export — OAuth flow, Drive API integration
  5. PDF export via @react-pdf/renderer for lecture summaries
  6. Room invite code — generate shorter user-friendly codes instead of cuid
  7. UI polish — responsive design, loading states, error boundaries
  8. Tests — API route integration tests, component tests </next_steps>
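
For the room invite code item above, one possible approach is a short code drawn from an alphabet without look-alike characters. This is only a sketch: the alphabet, the default length of 8, and the names `generateInviteCode` and `formatInviteCode` are assumptions, and the default random source is `Math.random` purely to keep the example dependency-free. A real implementation should inject a cryptographically secure source and rely on a unique constraint in the database to handle collisions.

```typescript
// Uppercase letters and digits with I, L, O, 0 and 1 removed to avoid misreading.
const INVITE_ALPHABET = "ABCDEFGHJKMNPQRSTUVWXYZ23456789";

// `randomInt(max)` must return an integer in [0, max).
// The default is NOT cryptographically secure; swap in a CSPRNG-backed function in production.
function generateInviteCode(
  length: number = 8,
  randomInt: (max: number) => number = (max) => Math.floor(Math.random() * max)
): string {
  let code = "";
  for (let i = 0; i < length; i++) {
    code += INVITE_ALPHABET.charAt(randomInt(INVITE_ALPHABET.length));
  }
  return code;
}

// Groups the code in blocks of four for display, e.g. "ABCDEFGH" becomes "ABCD-EFGH".
function formatInviteCode(code: string): string {
  return code.replace(/(.{4})(?=.)/g, "$1-");
}

// Strips separators and whitespace and uppercases user input before lookup.
function normalizeInviteCode(input: string): string {
  return input.replace(/[\s-]/g, "").toUpperCase();
}
```

Storing the unformatted code and normalizing user input before the by-code lookup lets guests type the code with or without the dash, in any letter case.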
Continue building iteratively. The core MVP (everything listed in current_state) is done. Focus on the next_steps list above, one feature at a time. Always run `npx tsc --noEmit` after changes to verify no type errors.