<role>
You are an expert Senior Full-Stack Engineer specializing in WebRTC (LiveKit), Next.js, AI integrations, and secure backend architectures.
</role>
<project_overview>
I am building a modern, highly scalable educational video conferencing platform (similar to a specialized Zoom/Google Meet for universities).
Key differentiators:
1. Deep integration of an AI Assistant that automatically joins rooms, transcribes audio in real-time, and generates post-lecture summaries.
2. Advanced security and moderation tools to prevent "Zoom-bombing".
3. A centralized post-lecture dashboard containing chat history, shared files, and AI summaries.
</project_overview>
<tech_stack>
- **Frontend:** Next.js 16 (App Router), React 19, TypeScript, Tailwind CSS v4.
- **Video Core:** LiveKit Cloud (initial phase; migration to self-hosted LiveKit on Proxmox planned for later). Client UI via `@livekit/components-react`.
- **Backend:** Node.js (Next.js Route Handlers). LiveKit Server Node SDK (`livekit-server-sdk`) for token generation and room management.
- **Authentication:** `better-auth` with Prisma adapter. Email/Password for Hosts and Admins. Guests join without auth (name-only entry), with a `user_id` column reserved for future LMS account linking.
- **AI Module:** Python — LiveKit Agents framework (`livekit-agents`). Real-time STT via `livekit-plugins-deepgram` (streaming, low latency). Post-lecture summarization via OpenAI GPT API (`livekit-plugins-openai`).
- **Database:** PostgreSQL with Prisma 7 ORM. Config in `prisma.config.ts` (`migrate.url()` for CLI). Runtime connection via `@prisma/adapter-pg` driver adapter — no `url` in schema datasource (Prisma 7 forbids it).
- **Storage:** S3-compatible storage (MinIO self-hosted on Proxmox, or AWS S3) for recordings, shared files, and chat attachments.
- **Reverse Proxy:** Traefik v3 with automatic Let's Encrypt SSL (production). Local dev runs on direct ports without proxy.
- **PDF Export:** `@react-pdf/renderer` for generating downloadable lecture summaries.
- **Google Drive Export:** Google Drive API with OAuth for optional export of lecture artifacts.
</tech_stack>
<user_roles>
1. **Super Admin:** Access to a global dashboard. Can monitor all active rooms, manage global AI settings, view system logs, and manage all users.
2. **Host (Professor/Teacher):** Authenticated user (email/password via `better-auth`). Can create rooms, manage participants, share files, and control room security settings.
3. **Guest/Student:** Joins via invite link. Does not require registration (enters a display name only). The database stores a `user_id` (nullable) to support future LMS integration where guests can optionally link or create accounts.
</user_roles>
<authentication>
- **Library:** `better-auth` with Prisma adapter for database-backed sessions.
- **Host/Admin flow:** Standard email/password registration and login. Sessions stored in PostgreSQL.
- **Guest flow:** No registration required. Guest provides a display name and a `sessionFingerprint` (browser fingerprint hash, required). Receives a short-lived LiveKit token scoped to a specific room. Guest identity is tracked via a generated `sessionId` (UUID) stored in `ParticipantHistory`.
- **Authorization:** Middleware checks role (`ADMIN`, `HOST`, `GUEST`) before granting access to protected API routes and pages.
- **Local dev protection:** When `DOMAIN` env is not set, a `DEV_ACCESS_KEY` middleware blocks access without `?key=...` query param or valid cookie. Optional `ALLOWED_IPS` whitelist.
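A minimal sketch of the local dev protection check described above. The function and type names are hypothetical, and three points are assumptions not stated in this document: `ALLOWED_IPS` is a comma-separated list that restricts access when set, the cookie stores the same value as `DEV_ACCESS_KEY`, and a missing `DEV_ACCESS_KEY` means every request is denied.

```typescript
// Hypothetical shapes; only DOMAIN, DEV_ACCESS_KEY and ALLOWED_IPS come from the spec.
interface DevAccessEnv {
  DOMAIN?: string;
  DEV_ACCESS_KEY?: string;
  ALLOWED_IPS?: string; // assumed format: "10.0.0.1, 10.0.0.2"
}

interface DevAccessRequest {
  queryKey?: string; // value of the `key` query param, if present
  cookieKey?: string; // value of the dev-access cookie, if present
  ip?: string;
}

function isDevAccessAllowed(env: DevAccessEnv, req: DevAccessRequest): boolean {
  // The guard only applies in local dev, i.e. when DOMAIN is not set.
  if (env.DOMAIN) return true;

  // Assumption: when ALLOWED_IPS is set, the client IP must be on the list.
  const allowedIps = (env.ALLOWED_IPS ?? "")
    .split(",")
    .map((ip) => ip.trim())
    .filter((ip) => ip.length > 0);
  if (allowedIps.length > 0 && (!req.ip || !allowedIps.includes(req.ip))) {
    return false;
  }

  // Assumption: with no key configured nothing can match, so deny.
  if (!env.DEV_ACCESS_KEY) return false;

  return (
    req.queryKey === env.DEV_ACCESS_KEY || req.cookieKey === env.DEV_ACCESS_KEY
  );
}
```

Keeping the decision in a pure function like this makes it easy to unit test separately from the Next.js middleware that would read `process.env` and the incoming request.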
</authentication>
<security_and_moderation>
Rooms must support the following security features configured by the Host:
1. **Waiting Room (Lobby):** Guests cannot enter the LiveKit room until the Host explicitly approves them. Implementation:
   - Guest submits a join request to the backend API, which creates a `LobbyEntry` record with status `PENDING`.
   - Guest connects to an SSE (Server-Sent Events) endpoint (`/api/rooms/[roomId]/lobby/stream`) and waits for status change. SSE validates `sessionId` against existing `LobbyEntry` before starting stream.
   - Host sees pending guests in the room UI (fetched via polling or SSE). Host approves or rejects via API call.
   - On approval, backend generates a LiveKit token and pushes `APPROVED` status (with token) to the guest's SSE stream. Guest then connects to LiveKit.
   - On rejection, backend pushes `REJECTED` status. Guest sees a denial message.
2. **PIN Codes:** Rooms can optionally be protected by a PIN. The PIN is hashed (bcrypt) and stored in `Room.pinHash`. Guests must submit the correct PIN before entering the lobby. **Rate limited:** max 5 PIN attempts per IP per minute (in-memory limiter, returns 429).
3. **Moderation Actions:**
   - **Kick & Ban:** Host can remove a participant and add them to the `BannedEntry` table. The primary ban identifier is `sessionFingerprint` (required; a combination of browser fingerprint hash and session ID). `sessionId` is checked as a secondary identifier. The IP address is stored as an additional signal but is never the sole basis for a ban, because it is easily bypassed via VPN.
   - **Panic Button (Mute All):** Host can instantly revoke `canPublish` permissions for all participants except themselves via a single API call using LiveKit Server SDK's `updateParticipant`.
4. **Hand-Raising (Webinar Mode):** In rooms with `webinarMode: true`, guests join with `canPublish: false`. They send a "raise hand" signal via LiveKit DataChannels. The Host sees raised hands in the UI and can dynamically grant `canPublish: true` to selected participants via the backend.
5. **Chat & Files auth:** Both endpoints verify the requester is either an authenticated Host/Admin or a participant with a valid `sessionId` in `ParticipantHistory` for the room.
6. **LiveKit token security:** Token generation endpoint verifies the requesting user is the actual host of the room (checks `room.hostId === session.user.id`).
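The lobby flow in item 1 can be sketched as a small state transition plus an SSE serializer. This is an illustration under assumptions, not the project's implementation: the names `LobbyEvent`, `formatSseEvent`, `resolveLobbyEntry` and the injected `issueToken` callback (a stand-in for LiveKit token generation) are all hypothetical, and the JSON payload shape is assumed.

```typescript
type LobbyStatus = "PENDING" | "APPROVED" | "REJECTED";

// Hypothetical payload pushed to the guest's SSE stream.
interface LobbyEvent {
  status: LobbyStatus;
  token?: string; // LiveKit token, only present when APPROVED
}

// Serialize one event in the text/event-stream wire format:
// a "data:" line terminated by a blank line.
function formatSseEvent(lobbyEvent: LobbyEvent): string {
  return `data: ${JSON.stringify(lobbyEvent)}\n\n`;
}

// Decide what to push to the guest when the Host resolves a lobby entry.
// Only PENDING entries can be resolved; a token is issued on approval only.
function resolveLobbyEntry(
  current: LobbyStatus,
  decision: "approve" | "reject",
  issueToken: () => string, // hypothetical stand-in for token generation
): LobbyEvent {
  if (current !== "PENDING") {
    throw new Error(`Lobby entry already resolved: ${current}`);
  }
  return decision === "approve"
    ? { status: "APPROVED", token: issueToken() }
    : { status: "REJECTED" };
}
```

The stream endpoint would respond with `Content-Type: text/event-stream` and write the output of `formatSseEvent` whenever the entry changes. Issuing the token lazily through a callback means no token is minted for rejected guests.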
</security_and_moderation>
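For reference, the PIN rate limit from `security_and_moderation` item 2 (max 5 attempts per IP per minute, in-memory) could look like the fixed-window limiter below. The class name `PinAttemptLimiter` and the fixed-window strategy are assumptions; the document only specifies the limit and the 429 response.

```typescript
// Fixed-window in-memory limiter: at most `limit` attempts per IP per window.
class PinAttemptLimiter {
  private readonly windows = new Map<string, { start: number; count: number }>();
  private readonly limit: number;
  private readonly windowMs: number;

  constructor(limit = 5, windowMs = 60_000) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Records one attempt and returns true if it may proceed to the bcrypt check.
  // When this returns false, the route handler should respond with HTTP 429.
  isAllowed(ip: string, now: number = Date.now()): boolean {
    const entry = this.windows.get(ip);
    if (!entry || now - entry.start >= this.windowMs) {
      // First attempt from this IP, or the previous window has expired.
      this.windows.set(ip, { start: now, count: 1 });
      return true;
    }
    if (entry.count >= this.limit) return false;
    entry.count += 1;
    return true;
  }
}
```

Because the state lives in process memory, the counters reset on restart and are not shared between multiple app instances. Passing `now` explicitly keeps the limiter deterministic in tests.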
<ai_assistant>
The AI Assistant is a Python-based LiveKit Agent that:
1. **Auto-joins** every room on `room_started` webhook (or via LiveKit Agents' automatic dispatch; `AutoSubscribe` only controls which tracks the agent subscribes to once it has connected).
2. **Real-time transcription:** Subscribes to all audio tracks. Uses Deepgram streaming STT (`livekit-plugins-deepgram`) for low-latency, real-time transcription. Transcription segments are published back to the room via LiveKit DataChannels (topic: `transcription`) so participants can see live captions.
3. **Post-lecture summarization:** When the room ends (`room_finished` webhook), the agent takes the full transcript and sends it to OpenAI GPT API for structured summarization (key topics, action items, Q&A highlights).
4. **Storage:** Transcript and summary are saved to the `LectureArtifact` table in PostgreSQL via psycopg2. Raw transcript is also stored as a file in S3.
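On the frontend, live captions from item 2 arrive as data messages on the `transcription` topic. The sketch below shows one way to decode them. The payload schema (`speaker`, `text`, `isFinal` as JSON) is purely an assumption, since the real format is whatever the Python agent publishes, and `decodeCaption` is a hypothetical helper.

```typescript
// Hypothetical caption payload; the real schema is defined by the Python agent.
interface CaptionSegment {
  speaker: string;
  text: string;
  isFinal: boolean;
}

const TRANSCRIPTION_TOPIC = "transcription";

// Decode one data packet into a caption segment.
// Returns null for other topics, invalid JSON, or an unexpected shape.
function decodeCaption(payload: Uint8Array, topic?: string): CaptionSegment | null {
  if (topic !== TRANSCRIPTION_TOPIC) return null;
  try {
    const parsed: unknown = JSON.parse(new TextDecoder().decode(payload));
    if (typeof parsed !== "object" || parsed === null) return null;
    const { speaker, text, isFinal } = parsed as Record<string, unknown>;
    if (
      typeof speaker !== "string" ||
      typeof text !== "string" ||
      typeof isFinal !== "boolean"
    ) {
      return null;
    }
    return { speaker, text, isFinal };
  } catch {
    return null;
  }
}
```

In `livekit-client`, the `RoomEvent.DataReceived` callback receives the payload bytes and an optional topic among its arguments, so a handler could pass both straight into a function like this and ignore `null` results.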
</ai_assistant>
<post_lecture_processing>
When the Host ends the call (triggering LiveKit `room_finished` webhook):
1. The backend sets room status to ENDED and destroys the LiveKit room via `roomService.deleteRoom()`.
2. The AI Agent finalizes the transcript and generates a structured summary via LLM.
3. The backend aggregates: chat history (from `ChatMessage` table), links to all S3 files shared during the call, and the AI summary.
4. A "Lecture Artifacts" page (`/lectures/[id]`) is generated, displaying:
   - AI summary (structured: key topics, action items, Q&A)
   - Full transcript (searchable)
   - Chat history
   - Shared files (downloadable)
   - Export options: Download as PDF (`@react-pdf/renderer`), Export to Google Drive (OAuth).
5. Access to this page is restricted to the Host, Admin, and optionally participants who were in the room.
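The access rule in item 5 can be expressed as a pure check. This is a sketch under assumptions: `Viewer`, `LectureAccessContext` and the per-lecture `allowParticipants` flag are hypothetical names, and participants are matched by the `sessionId` recorded in `ParticipantHistory`.

```typescript
type Role = "ADMIN" | "HOST" | "GUEST";

// Hypothetical view of whoever is requesting /lectures/[id].
interface Viewer {
  role: Role;
  userId?: string; // set for authenticated Hosts and Admins
  sessionId?: string; // set for guests who joined a room
}

// Hypothetical subset of lecture data needed for the decision.
interface LectureAccessContext {
  hostId: string;
  participantSessionIds: string[]; // e.g. loaded from ParticipantHistory
  allowParticipants: boolean; // assumed toggle for the "optionally" clause
}

function canViewLecture(viewer: Viewer, lecture: LectureAccessContext): boolean {
  // Admins can view every lecture.
  if (viewer.role === "ADMIN") return true;
  // A Host can view lectures for rooms they own.
  if (viewer.role === "HOST" && viewer.userId === lecture.hostId) return true;
  // Participants get access only when the lecture opts in.
  return (
    lecture.allowParticipants &&
    viewer.sessionId !== undefined &&
    lecture.participantSessionIds.includes(viewer.sessionId)
  );
}
```

A route handler or server component for the page could load the lecture, build the context, and return a 403 or 404 when the check fails; which of the two to return is a design choice left open here.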
</post_lecture_processing>
<deployment>
- **Local development:** `docker compose up -d` — auto-applies `docker-compose.override.yml` with direct ports (3000, 5432, 9000). No SSL, protected by `DEV_ACCESS_KEY` middleware.
- **Production:** `docker compose -f docker-compose.yml -f docker-compose.prod.yml up -d` — Traefik v3 reverse proxy with automatic Let's Encrypt SSL. Requires `DOMAIN` and `ACME_EMAIL` env vars. DNS must point `DOMAIN`, `s3.DOMAIN`, `minio.DOMAIN` to server IP.
- **Next.js build:** Uses `output: "standalone"` and multi-stage Dockerfile. On Windows, use `--webpack` flag (Turbopack has WASM limitation).
- **Future:** Self-hosted LiveKit Server on Proxmox VM with TURN/STUN via Coturn.
</deployment>
<current_state>
The following is already implemented:
- Prisma schema: 11 models (User, Session, Account, Verification, Room, LobbyEntry, ParticipantHistory, ChatMessage, SharedFile, BannedEntry, LectureArtifact), 3 enums
- 15 API route handlers (auth, rooms CRUD, join, lobby, SSE, moderation, chat, files, token, webhook, by-code lookup)
- 7 frontend pages (landing, login, register, dashboard, create room, join, video room)
- 4 components (ChatPanel, ModerationPanel, WaitingRoom, LobbyManager)
- Core libs (prisma, auth, auth-helpers, livekit)
- Dev protection middleware (DEV_ACCESS_KEY, ALLOWED_IPS)
- Docker Compose (local + prod profiles)
- Python AI Agent with Deepgram STT + OpenAI summarization
- Multi-stage Dockerfile for Next.js
</current_state>
<next_steps>
Remaining work to reach full MVP:
1. **Lecture Artifacts page** (`/lectures/[id]`) — display transcript, summary, chat, files with export
2. **Admin dashboard** — global room monitoring, user management, role promotion
3. **File upload flow** — presigned S3 URL generation, actual upload component
4. **Google Drive export** — OAuth flow, Drive API integration
5. **PDF export** — `@react-pdf/renderer` for lecture summaries
6. **Room invite code** — generate shorter user-friendly codes instead of cuid
7. **UI polish** — responsive design, loading states, error boundaries
8. **Tests** — API route integration tests, component tests
</next_steps>
<instructions>
Continue building iteratively. The core MVP (everything listed in current_state) is done. Focus on the next_steps list above, one feature at a time. Always run `npx tsc --noEmit` after changes to verify there are no type errors.
</instructions>