@RafiulM RafiulM commented Nov 2, 2025

Summary

This PR converts the repository to a monorepo structure with Turborepo and adds a comprehensive Express.js worker service for document processing with AI-powered embeddings and vector search.

Key Features

  • Monorepo Structure: Converted to Turborepo with pnpm workspace
  • Express.js Worker Service: Background document processing with BullMQ
  • AI Document Processing: PDF and Markdown parsing with OpenAI embeddings
  • Vector Search: pgvector-powered semantic similarity search
  • File Upload Interface: Drag-and-drop upload with real-time progress tracking
  • Document Management: Complete CRUD operations for processed documents
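
To make the vector search feature concrete, here is a minimal in-memory sketch of ranking by cosine distance, one of the distance metrics pgvector supports (its `<=>` operator). The `StoredChunk` shape and the `rankByCosineDistance` helper are hypothetical names used for illustration, not code from this PR, and which metric the PR actually queries with is an assumption; the real comparison would run inside PostgreSQL.

```typescript
// Hypothetical shape of a stored chunk; the real schema lives in packages/db.
interface StoredChunk {
  id: string;
  embedding: number[];
}

// Cosine distance = 1 - cosine similarity (0 means same direction).
function cosineDistance(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Embeddings must have the same dimension");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) {
    throw new Error("Cosine distance is undefined for zero vectors");
  }
  return 1 - dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the `limit` chunks closest to the query, nearest first.
function rankByCosineDistance(
  query: number[],
  chunks: StoredChunk[],
  limit: number
): StoredChunk[] {
  return chunks
    .map((chunk) => ({ chunk, distance: cosineDistance(query, chunk.embedding) }))
    .sort((x, y) => x.distance - y.distance)
    .slice(0, limit)
    .map((entry) => entry.chunk);
}
```

In the service itself the query embedding would come from the embedding model and the sort would be an `ORDER BY` on the vector column, so this helper is only useful for reasoning about results or for unit tests.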

Changes

  • Repository Structure:

    • apps/web/ - Next.js application (moved from root)
    • apps/worker/ - New Express.js worker service
    • packages/db/ - Shared Drizzle ORM schema and migrations
    • packages/types/ - Shared TypeScript types
    • packages/tsconfig/ - Shared TypeScript configurations
  • New Services:

    • BullMQ queue system with Redis for background job processing
    • Document processing pipeline with PDF/Markdown parsing
    • Vector embeddings using OpenAI's text-embedding-3-small
    • Semantic similarity search with pgvector
    • REST API for file upload, job status, and document management
  • Frontend Components:

    • File upload component with drag-and-drop and progress tracking
    • Document search interface with real-time results
    • Job status polling and user feedback
    • Document management page with upload and search tabs
  • Infrastructure:

    • Docker Compose with Redis and pgvector-enabled PostgreSQL
    • Environment configuration for all services
    • Proper error handling and logging throughout
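
Because the web app and the worker share types through packages/types, a job status payload is a natural candidate for a shared definition. The sketch below is an assumption about what such a type could look like: `JobState`, `JobStatus`, and `isJobStatus` are hypothetical names, and the 0 to 100 progress range is an assumed convention rather than something this PR specifies.

```typescript
// Hypothetical subset of job states; BullMQ reports states such as
// "waiting", "active", "completed", and "failed".
type JobState = "waiting" | "active" | "completed" | "failed";

interface JobStatus {
  jobId: string;
  state: JobState;
  progress: number; // assumed to be a percentage from 0 to 100
  error?: string;
}

const JOB_STATES: readonly string[] = ["waiting", "active", "completed", "failed"];

// Runtime guard so the web app can validate what the worker API returns
// before trusting the compile-time type.
function isJobStatus(value: unknown): value is JobStatus {
  if (typeof value !== "object" || value === null) {
    return false;
  }
  const v = value as Record<string, unknown>;
  const progress = v["progress"];
  const error = v["error"];
  return (
    typeof v["jobId"] === "string" &&
    typeof v["state"] === "string" &&
    JOB_STATES.indexOf(v["state"]) !== -1 &&
    typeof progress === "number" &&
    progress >= 0 &&
    progress <= 100 &&
    (error === undefined || typeof error === "string")
  );
}
```

Keeping the guard next to the type means both apps reject malformed payloads the same way instead of each re-implementing the check.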

Test Plan

  • Set up environment variables (OpenAI API key, Redis, PostgreSQL)
  • Start services with docker compose up -d
  • Install dependencies with pnpm install
  • Start development servers with pnpm dev
  • Test file upload functionality
  • Verify document processing completes successfully
  • Test semantic search functionality
  • Verify job status polling and real-time updates
  • Test document management operations (view, delete)
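
For the first step of the test plan, a small startup check can surface missing configuration early. This is a sketch under stated assumptions: the variable names `OPENAI_API_KEY`, `REDIS_URL`, and `DATABASE_URL` are guesses at what the services read, and `findMissingEnvVars` and `assertEnv` are hypothetical helpers, not part of this PR.

```typescript
// Assumed variable names; check the repository's env example for the real ones.
const REQUIRED_ENV_VARS = ["OPENAI_API_KEY", "REDIS_URL", "DATABASE_URL"] as const;

type EnvSource = Record<string, string | undefined>;

// Return the names of required variables that are missing or blank.
function findMissingEnvVars(env: EnvSource): string[] {
  return REQUIRED_ENV_VARS.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}

// Throw a single readable error listing everything that is missing.
function assertEnv(env: EnvSource): void {
  const missing = findMissingEnvVars(env);
  if (missing.length > 0) {
    throw new Error("Missing environment variables: " + missing.join(", "));
  }
}
```

In a Node.js service this would typically be called once at startup as `assertEnv(process.env)`, before connecting to Redis or PostgreSQL.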

Technical Details

  • Uses Turborepo for efficient monorepo builds
  • BullMQ for reliable queue processing with Redis
  • pgvector for high-performance vector similarity search
  • OpenAI embeddings for text understanding
  • React Dropzone for intuitive file uploads
  • Near-real-time progress tracking by polling the job status endpoint
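
The job status polling mentioned in this PR can be sketched as a loop that stops on a terminal state. Everything named here (`PolledStatus`, `pollJob`, the one-second default interval, the attempt budget) is an assumption for illustration; the status fetcher is injected so the sketch does not presume a particular endpoint or response shape.

```typescript
type PolledState = "waiting" | "active" | "completed" | "failed";

interface PolledStatus {
  state: PolledState;
  progress: number;
}

// A job is finished once it reaches one of these states.
function isTerminal(state: PolledState): boolean {
  return state === "completed" || state === "failed";
}

function sleep(ms: number): Promise<void> {
  return new Promise<void>((resolve) => {
    setTimeout(resolve, ms);
  });
}

// Poll until the job finishes or the attempt budget runs out.
// `fetchStatus` is supplied by the caller (for example, a wrapper around fetch).
async function pollJob(
  fetchStatus: () => Promise<PolledStatus>,
  onProgress: (status: PolledStatus) => void,
  intervalMs: number = 1000,
  maxAttempts: number = 60
): Promise<PolledStatus> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const status = await fetchStatus();
    onProgress(status);
    if (isTerminal(status.state)) {
      return status;
    }
    await sleep(intervalMs);
  }
  throw new Error("Job did not finish within the polling budget");
}
```

Capping the number of attempts keeps a stuck job from polling forever; a UI would usually surface the thrown error as a timeout message.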

🤖 Generated with Claude Code

Codespace Runner and others added 2 commits November 2, 2025 10:51
… processing

- Convert repository to Turborepo monorepo structure with pnpm workspace
- Create apps/web (Next.js) and apps/worker (Express.js) services
- Add packages/db (shared Drizzle ORM schema) and packages/types (shared TypeScript types)
- Add packages/tsconfig for shared TypeScript configurations
- Implement Express.js worker service with BullMQ queue system and Redis
- Build comprehensive document processing pipeline with AI SDK embeddings
- Add pgvector support for vector similarity search in PostgreSQL
- Create file upload API and frontend components with drag-and-drop
- Implement semantic search interface with real-time job status tracking
- Add Docker Compose configuration with Redis and pgvector-enabled PostgreSQL
- Create comprehensive API routes for document management and search
- Update navigation and create dedicated Documents page
- Add proper error handling, progress tracking, and user feedback

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
…e base. Fixed some migration and embedding calls
