Initial commit: Ebook Translation System with Docker setup

richardtekula
2025-11-11 16:01:34 +01:00
commit e1b95c613d
43 changed files with 13922 additions and 0 deletions

SYSTEM_DOCUMENTATION.md

@@ -0,0 +1,768 @@
# EBOOK TRANSLATION SYSTEM - COMPLETE SYSTEM DOCUMENTATION
**Version:** 1.0.0
**Document Type:** System Architecture & Technical Documentation
---
## TABLE OF CONTENTS
1. [Executive Summary](#1-executive-summary)
2. [System Overview](#2-system-overview)
3. [System Architecture](#3-system-architecture)
4. [System Workflows](#4-system-workflows)
5. [Security Features](#5-security-features)
6. [Technology Stack](#6-technology-stack)
7. [Deployment Architecture](#7-deployment-architecture)
---
## 1. EXECUTIVE SUMMARY
### 1.1 Project Purpose
The **Ebook Translation System** is an enterprise-grade web application designed to manage ebook translations through a Chrome extension and admin panel. The system consists of three main components:
1. **Admin Backend** - FastAPI-based REST API server
2. **Admin Dashboard** - Web-based management interface
3. **Chrome Extension** - Browser extension for automated ebook translation
### 1.2 System Capabilities
- **Coupon Code Management**: Generate, validate, and track coupon codes for access control
- **Translation File Management**: Upload, download, and manage Excel-based translation files
- **Automated Translation**: Browser extension that applies translations to ebooks automatically
- **Admin Dashboard**: Comprehensive interface for managing all system resources
- **Access Control**: Session-based authentication with admin privileges
- **Usage Tracking**: Monitor coupon usage and translation activities
---
## 2. SYSTEM OVERVIEW
### 2.1 High-Level Architecture
```
┌─────────────────────────────────────────────────────────────────┐
│                    EBOOK TRANSLATION SYSTEM                     │
└─────────────────────────────────────────────────────────────────┘
┌──────────────────┐         ┌──────────────────┐         ┌─────────────────┐
│   ADMIN PANEL    │         │   BACKEND API    │         │     CHROME      │
│   (Frontend)     │◄───────►│    (FastAPI)     │◄───────►│    EXTENSION    │
│                  │  HTTP   │                  │  HTTP   │                 │
│ - Login          │         │ - Auth Routes    │         │ - Verification  │
│ - Coupons        │         │ - Coupon Mgmt    │         │ - Translation   │
│ - Translations   │         │ - Translation    │         │ - Excel Load    │
└──────────────────┘         └────────┬─────────┘         └─────────────────┘
                                      │
                                      ▼
                             ┌─────────────────┐
                             │   PostgreSQL    │
                             │    Database     │
                             │                 │
                             │ - admin_users   │
                             │ - coupon_codes  │
                             └─────────────────┘
```
### 2.2 User Roles & Workflows
#### **Administrator Workflow:**
1. Login to Admin Dashboard
2. Generate coupon codes (single or bulk)
3. Upload coupon codes via Excel file
4. Monitor coupon usage
5. Manage translation files (upload/download/delete)
#### **End User Workflow:**
1. Install Chrome Extension
2. Enter coupon code (validates against backend)
3. Select target language
4. Extension downloads translation file from backend
5. Start automated translation on ebook pages
6. Extension applies translations based on Excel data
---
## 3. SYSTEM ARCHITECTURE
### 3.1 Component Architecture
#### **3.1.1 Admin Backend (FastAPI Application)**
**Location:** `/admin-backend/`
**Purpose:**
- Central API server handling all business logic
- Authentication and authorization
- Database operations
- File management
**Key Features:**
- RESTful API architecture
- Automatic database initialization
- Session-based authentication
- Request logging and monitoring
- Error handling middleware
- CORS support for frontend integration
**Core Files:**
- `main.py` - FastAPI application entry point
- `init_db.py` - Database initialization script
- `routes/auth.py` - All API endpoints
- `models/` - SQLAlchemy database models
- `utils/` - Helper functions and utilities
- `schemas.py` - Pydantic validation schemas
---
#### **3.1.2 Admin Frontend (Web Dashboard)**
**Location:** `/admin-frontend/`
**Purpose:**
- User interface for administrators
- Coupon management interface
- Translation file management
- System monitoring
**Key Features:**
- Modern responsive UI
- Real-time data updates
- Pagination for large datasets
- Search and filter functionality
- Excel file upload with validation
- Drag-and-drop file upload
**Core Files:**
- `admin_login.html` - Login page
- `admin_login.js` - Login logic
- `admin_dashboard.html` - Main dashboard UI
- `admin_dashboard.js` - Dashboard functionality
---
#### **3.1.3 Chrome Extension**
**Location:** `/extension/`
**Purpose:**
- Browser-based translation tool
- Automated ebook translation
- Access code verification
- Translation file consumption
**Key Features:**
- Modular service architecture
- Fuzzy text matching for translations
- Multi-language support
- Automatic page navigation
- Section highlighting
- Note addition to ebook pages
**Core Files:**
- `manifest.json` - Extension configuration
- `popup.html/popup.js` - Extension UI
- `config.js` - Configuration constants
- `authService.js` - Authentication logic
- `excelService.js` - Excel data management
- `translationService.js` - Translation orchestration
- `contentService.js` - DOM manipulation
- `pageService.js` - Page navigation
- `uiService.js` - UI management
- `eventHandlers.js` - Event management
---
### 3.2 Data Flow Architecture
#### **Scenario 1: Admin Uploads Translation File**
```
Admin Dashboard → Backend API → File System → Database Metadata
   (Upload)       (Validate)      (Store)          (Track)
```
**Step-by-Step:**
1. Admin selects Excel file in dashboard
2. Frontend sends file to `/upload-translations` endpoint
3. Backend validates file format and size
4. File saved to `/translationfile/translation.xlsx`
5. Original filename stored in `metadata.txt`
6. Success response returned to frontend
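
The validation and storage steps above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the actual handler behind `/upload-translations`; the function name `store_translation_file` and the `storage_dir` parameter are hypothetical, while the 10MB limit, the extension whitelist, and the `translation.xlsx` / `metadata.txt` file names come from this document.

```python
from pathlib import Path

MAX_BYTES = 10 * 1024 * 1024          # 10MB limit stated in section 5.2
ALLOWED_SUFFIXES = {".xlsx", ".xls"}  # extension whitelist from section 5.3


def store_translation_file(original_name: str, data: bytes, storage_dir: Path) -> Path:
    """Validate an uploaded spreadsheet and store it under a fixed name."""
    # Keep only the final path component so "../" segments cannot escape storage_dir.
    safe_name = Path(original_name.replace("\\", "/")).name
    if Path(safe_name).suffix.lower() not in ALLOWED_SUFFIXES:
        raise ValueError("only .xlsx and .xls files are accepted")
    if len(data) > MAX_BYTES:
        raise ValueError("file exceeds the 10MB limit")

    storage_dir.mkdir(parents=True, exist_ok=True)
    target = storage_dir / "translation.xlsx"
    target.write_bytes(data)
    # Record the name the admin used for the file, as described in step 5.
    (storage_dir / "metadata.txt").write_text(safe_name, encoding="utf-8")
    return target
```

A caller would pass the raw upload bytes, for example `store_translation_file("books.xlsx", payload, Path("translationfile"))`, and translate a `ValueError` into an error response.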
---
#### **Scenario 2: User Verifies Coupon Code**
```
Chrome Extension → Backend API → Database → Response
 (Submit Code)     (Validate)    (Check)    (Result)
       ↓                            ↓
Save to Storage ←──────────── Mark as Used
```
**Step-by-Step:**
1. User enters coupon code in extension
2. Extension sends POST to `/verify` endpoint
3. Backend queries `coupon_codes` table
4. If valid and unused, marks as used (usage_count++)
5. Timestamps recorded (Asia/Kolkata timezone)
6. Extension saves verification status locally
7. User can proceed to language selection
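
Steps 3 to 5 can be sketched roughly as follows. This is an illustration, not the code in `routes/auth.py`: it uses SQLite from the standard library as a stand-in for PostgreSQL and SQLAlchemy, the function name `verify_coupon` is hypothetical, and the table definition is assumed from the column names mentioned in this document. The Asia/Kolkata timezone is modelled as a fixed UTC+05:30 offset.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

# Fixed UTC+05:30 offset standing in for the Asia/Kolkata timezone.
IST = timezone(timedelta(hours=5, minutes=30), "Asia/Kolkata")


def verify_coupon(conn: sqlite3.Connection, raw_code: str) -> bool:
    """Return True and mark the code as used if it exists and is still unused."""
    code = raw_code.strip().upper()  # case-insensitive comparison
    used_at = datetime.now(IST).isoformat()
    # One conditional UPDATE keeps "check" and "mark as used" in a single statement.
    cur = conn.execute(
        "UPDATE coupon_codes SET usage_count = usage_count + 1, used_at = ? "
        "WHERE code = ? AND usage_count = 0",
        (used_at, code),
    )
    conn.commit()
    return cur.rowcount == 1


# Assumed minimal schema; the real table may have more columns.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE coupon_codes ("
    "code TEXT PRIMARY KEY, usage_count INTEGER DEFAULT 0, used_at TEXT)"
)
conn.execute("INSERT INTO coupon_codes (code) VALUES ('ABC123')")
print(verify_coupon(conn, "abc123"))  # first use of a valid code
print(verify_coupon(conn, "ABC123"))  # same code again, already used
```

Folding the check into the `WHERE` clause means two simultaneous requests for the same code cannot both succeed, which is one way to enforce one-time use.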
---
#### **Scenario 3: Translation Execution**
```
Extension → Backend API → Excel File → Translation Service
(Start) (Download) (Parse) (Apply to Page)
↓ ↓ ↓
Select Load Translation Find Best Highlight +
Language Data Match Add Note
```
**Step-by-Step:**
1. User selects target language
2. Extension downloads `/translations/latest`
3. Excel file parsed using SheetJS library
4. Extension identifies sections on ebook page
5. For each section:
   - Extract text content
   - Find translation using fuzzy matching
   - Highlight section on page
   - Add translated note
6. Automatically navigate to next page
7. Repeat until all pages processed
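
The "find translation using fuzzy matching" step can be illustrated with a short sketch. The extension itself runs in JavaScript and uses Fuzzysort; the version below swaps in Python's standard `difflib` purely to show the idea. The sample rows, the `German` column name, the function names, and the 0.8 threshold are all assumptions made for the example.

```python
from difflib import SequenceMatcher
from typing import Optional

# Hypothetical rows mirroring the sheet layout "Original, Language1, Language2, ...".
ROWS = [
    {"Original": "Chapter One: The Beginning", "German": "Kapitel Eins: Der Anfang"},
    {"Original": "Once upon a time there was a king.", "German": "Es war einmal ein König."},
]


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so layout differences do not affect matching."""
    return " ".join(text.lower().split())


def find_translation(section_text: str, rows: list, language: str,
                     threshold: float = 0.8) -> Optional[str]:
    """Return the translation of the closest 'Original' entry, or None if nothing is close."""
    target = normalize(section_text)
    best_score, best_row = 0.0, None
    for row in rows:
        score = SequenceMatcher(None, target, normalize(row["Original"])).ratio()
        if score > best_score:
            best_score, best_row = score, row
    if best_row is None or best_score < threshold:
        return None
    return best_row.get(language)


print(find_translation("once upon a time  there was a king", ROWS, "German"))
```

A threshold avoids attaching a translation to a section that merely resembles an entry; the right value depends on how noisy the extracted page text is.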
---
## 4. SYSTEM WORKFLOWS
### 4.1 Complete User Journey
#### **Phase 1: System Setup (Admin)**
```
Step 1: Admin Login
├── Navigate to http://localhost:8000/login
├── Enter credentials (admin/admin@123)
├── Click "Login"
└── Redirected to Dashboard
Step 2: Generate Coupon Codes
├── Click "Generate" tab
├── Select mode (Single or Bulk)
├── For Bulk: Enter count
├── Click "Generate Codes"
└── View generated codes
Step 3: Upload Translation File
├── Click "Translation Upload" tab
├── Select Excel file (.xlsx)
├── File contains columns: Original, Language1, Language2, etc.
├── Click "Upload"
└── Confirmation message displayed
```
---
#### **Phase 2: End User Experience**
```
Step 1: Install Extension
├── Load extension in Chrome
├── Open extension popup
└── See verification screen
Step 2: Verify Access Code
├── Enter coupon code
├── Click "Verify"
├── Extension calls /verify endpoint
├── If valid:
│ ├── Code marked as used in database
│ ├── Verification saved locally
│ └── Language selection screen shown
└── If invalid: Error message
Step 3: Select Language
├── Choose target language from dropdown
├── Language preference saved
└── Click "Start Translation"
Step 4: Translation Execution
├── Extension loads translation file
├── Parses Excel data
├── Identifies sections on ebook page
├── For each section:
│ ├── Extract text
│ ├── Find translation (fuzzy match)
│ ├── Highlight section
│ └── Add translated note
├── Navigate to next page
└── Repeat until complete
```
---
### 4.2 Technical Workflow Details
#### **Coupon Verification Workflow**
```
┌─────────────┐
│ User │
│ Enters │
│ Code │
└──────┬──────┘
┌─────────────────────────────────┐
│ Extension: authService.js │
│ ┌────────────────────────────┐ │
│ │ 1. Check if blocked │ │
│ │ 2. Normalize code │ │
│ │ 3. POST /verify │ │
│ └────────┬───────────────────┘ │
└───────────┼─────────────────────┘
┌─────────────────────────────────┐
│ Backend: routes/auth.py │
│ ┌────────────────────────────┐ │
│ │ 1. Extract code │ │
│ │ 2. Query database │ │
│ │ 3. Check usage_count │ │
│ │ 4. Increment usage │ │
│ │ 5. Set used_at timestamp │ │
│ │ 6. Return response │ │
│ └────────┬───────────────────┘ │
└───────────┼─────────────────────┘
┌─────────────────────────────────┐
│ PostgreSQL Database │
│ ┌────────────────────────────┐ │
│ │ UPDATE coupon_codes │ │
│ │ SET usage_count = 1, │ │
│ │ used_at = NOW() │ │
│ │ WHERE code = ? │ │
│ └────────────────────────────┘ │
└─────────────────────────────────┘
```
---
#### **Translation Execution Workflow**
```
┌─────────────────────────────────┐
│ Extension: translationService │
└──────────────┬──────────────────┘
┌───────────────┐
│ Load Excel │────┐
│ Data │ │
└───────┬───────┘ │
│ │
▼ ▼
┌───────────────┐ ┌──────────────┐
│ Get Active │ │ Parse Excel │
│ Tab │ │ with XLSX.js │
└───────┬───────┘ └──────┬───────┘
│ │
└────────┬────────┘
┌───────────────┐
│ Collect │
│ Sections │
│ from Page │
└───────┬───────┘
┌───────────────────────┐
│ FOR EACH SECTION: │
│ ┌───────────────────┐ │
│ │ 1. Select section │ │
│ │ 2. Extract text │ │
│ │ 3. Find match │ │
│ │ 4. Highlight │ │
│ │ 5. Add note │ │
│ └───────────────────┘ │
└───────────┬───────────┘
┌───────────────┐
│ Next Page? │
└───────┬───────┘
┌────────┴────────┐
▼ ▼
┌────────┐ ┌─────────┐
│ Yes │ │ No │
│ Repeat │ │Complete │
└────────┘ └─────────┘
```
---
## 5. SECURITY FEATURES
### 5.1 Authentication Security
**Password Security:**
- Bcrypt hashing via Passlib (bcrypt 4.0.1)
- Salt: auto-generated per password; cost factor: library default
- Timing-safe password comparison
- No plain-text password storage
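
The properties listed above (a per-password salt, a slow hash, and a timing-safe comparison) can be demonstrated with the standard library alone. Note the substitution: the system uses bcrypt through Passlib, whereas this sketch uses PBKDF2 from `hashlib` so that it runs without third-party packages. The function names, the storage format, and the iteration count are illustrative assumptions.

```python
import hashlib
import hmac
import secrets

ITERATIONS = 100_000  # illustrative work factor, not a value taken from this system


def hash_password(password: str) -> str:
    """Return 'salt$digest' in hex; only this string is stored, never the password."""
    salt = secrets.token_bytes(16)  # fresh random salt for every password
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return f"{salt.hex()}${digest.hex()}"


def verify_password(password: str, stored: str) -> bool:
    """Recompute the digest with the stored salt and compare in constant time."""
    salt_hex, digest_hex = stored.split("$")
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), bytes.fromhex(salt_hex), ITERATIONS
    )
    return hmac.compare_digest(candidate, bytes.fromhex(digest_hex))
```

Because the salt is random, hashing the same password twice gives different stored strings, and `hmac.compare_digest` avoids leaking how many leading bytes matched.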
**Session Security:**
- HTTP-only cookies (no JavaScript access)
- SameSite=Strict (CSRF protection)
- Secure flag in production (HTTPS only)
- Session-based (no JWT tokens in localStorage)
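
As a rough illustration of these cookie attributes, the sketch below builds a `Set-Cookie` header value with the standard library. The cookie name `session` and the function name are hypothetical; the backend's real session handling is not shown in this document.

```python
from http.cookies import SimpleCookie


def build_session_cookie(session_id: str, production: bool) -> str:
    """Return a Set-Cookie header value carrying the attributes listed above."""
    cookie = SimpleCookie()
    cookie["session"] = session_id       # hypothetical cookie name
    morsel = cookie["session"]
    morsel["httponly"] = True            # not readable from JavaScript
    morsel["samesite"] = "Strict"        # not sent on cross-site requests
    morsel["path"] = "/"
    if production:
        morsel["secure"] = True          # only sent over HTTPS
    return morsel.OutputString()


print(build_session_cookie("abc123", production=True))
```

Keeping the `Secure` flag conditional mirrors the "Secure flag in production" item: over plain HTTP on localhost a `Secure` cookie would not be sent back.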
**Login Protection:**
- Rate limiting in extension (3 attempts)
- Time-based blocking (24 hours)
- Failed attempt tracking
- Block status persistence
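
The attempt limit and 24-hour block can be modelled as a small state machine. The extension implements this in JavaScript on top of Chrome storage; the Python sketch below only illustrates the logic, and the class and method names are hypothetical. The clock is passed in explicitly so the behaviour is easy to test.

```python
from dataclasses import dataclass
from typing import Optional

MAX_ATTEMPTS = 3                 # attempts allowed before blocking
BLOCK_SECONDS = 24 * 60 * 60     # 24-hour block


@dataclass
class AttemptTracker:
    failed: int = 0
    blocked_until: Optional[float] = None

    def is_blocked(self, now: float) -> bool:
        """Report whether a block is active, clearing it once it has expired."""
        if self.blocked_until is not None and now >= self.blocked_until:
            self.failed, self.blocked_until = 0, None
        return self.blocked_until is not None

    def record_failure(self, now: float) -> None:
        """Count a failed verification and start a block at the limit."""
        self.failed += 1
        if self.failed >= MAX_ATTEMPTS:
            self.blocked_until = now + BLOCK_SECONDS

    def record_success(self) -> None:
        """Reset the counter after a successful verification."""
        self.failed, self.blocked_until = 0, None
```

A limit enforced only in the client can be reset by clearing the extension's storage, so a server-side limit on `/verify` would be needed for stronger protection.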
---
### 5.2 API Security
**CORS Configuration:**
- Configurable allowed origins
- Credentials support
- Preflight handling
- Environment-based restrictions
**Input Validation:**
- Pydantic schema validation
- SQL injection prevention (ORM)
- File type validation
- Size limits (10MB for files)
- XSS prevention (no HTML rendering)
**Authorization:**
- Cookie-based auth check on protected routes
- 401 Unauthorized for invalid sessions
- Route-level authentication decorators
---
### 5.3 Data Security
**Database Security:**
- Parameterized queries (SQLAlchemy ORM)
- No raw SQL execution
- Transaction management
- Connection pooling
**File Upload Security:**
- Extension whitelist (.xlsx, .xls only)
- Size limits (10MB)
- Filename sanitization
- Overwrite prevention
- Isolated storage directory
**Coupon Code Security:**
- Case-insensitive comparison
- One-time use enforcement
- Usage tracking
- Duplicate prevention
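
Duplicate prevention during single or bulk generation might look like the sketch below. The code format (10 uppercase letters and digits) and the function name `generate_codes` are assumptions; the document does not specify how codes are built.

```python
import secrets
import string

ALPHABET = string.ascii_uppercase + string.digits  # assumed code alphabet


def generate_codes(count: int, existing: set[str], length: int = 10) -> list[str]:
    """Generate `count` new codes that collide neither with each other nor with `existing`."""
    seen = {code.upper() for code in existing}  # compare case-insensitively
    new_codes: list[str] = []
    while len(new_codes) < count:
        code = "".join(secrets.choice(ALPHABET) for _ in range(length))
        if code not in seen:
            seen.add(code)
            new_codes.append(code)
    return new_codes


print(len(generate_codes(5, existing={"ABC123"})))
```

Using `secrets` rather than `random` makes codes hard to guess, and a unique constraint on the `code` column would still be a sensible backstop against races.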
---
### 5.4 Production Security Recommendations
**Must Implement:**
1. HTTPS/TLS encryption
2. Strong SECRET_KEY (32+ characters)
3. Change default admin password
4. Database SSL connections
**Environment Variables:**
```bash
DEBUG=false
ENVIRONMENT=production
SECRET_KEY=<generated-secure-key>
ADMIN_PASSWORD=<strong-password>
DATABASE_URL=postgresql://user:pass@host/db?sslmode=require
CORS_ORIGINS=https://yourdomain.com
```
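
A startup check along these lines could catch unsafe production settings early. It is only a sketch: the function name is hypothetical, treating `CORS_ORIGINS` as a comma-separated list is an assumption, and the rejected default password is the one shown in section 4.1.

```python
import secrets
from typing import Mapping


def load_production_settings(env: Mapping[str, str]) -> dict:
    """Validate production environment variables and return the parsed values."""
    secret_key = env.get("SECRET_KEY", "")
    if len(secret_key) < 32:
        raise ValueError("SECRET_KEY must be at least 32 characters")
    if env.get("DEBUG", "false").lower() != "false":
        raise ValueError("DEBUG must be false in production")
    if env.get("ADMIN_PASSWORD", "admin@123") == "admin@123":
        raise ValueError("ADMIN_PASSWORD must be changed from the default")
    if "sslmode=require" not in env.get("DATABASE_URL", ""):
        raise ValueError("DATABASE_URL should request an SSL connection")
    # Assumption: multiple origins are separated by commas.
    origins = [o.strip() for o in env.get("CORS_ORIGINS", "").split(",") if o.strip()]
    if not origins or any(not o.startswith("https://") for o in origins):
        raise ValueError("CORS_ORIGINS must list https:// origins only")
    return {"secret_key": secret_key, "cors_origins": origins}


# One way to produce a suitable SECRET_KEY: 32 random bytes, URL-safe encoded.
print(secrets.token_urlsafe(32))
```

In the application this would be called with `os.environ` before the server starts accepting requests.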
---
## 6. TECHNOLOGY STACK
### 6.1 Backend Technologies
| Component | Technology | Version | Purpose |
|-----------|-----------|---------|---------|
| **Framework** | FastAPI | Latest | Web framework |
| **Server** | Uvicorn | Latest | ASGI server |
| **ORM** | SQLAlchemy | 2.x | Database ORM |
| **Database** | PostgreSQL | 12+ | Data storage |
| **Validation** | Pydantic | 2.x | Data validation |
| **Password** | Passlib + bcrypt | bcrypt 4.0.1 | Password hashing |
| **Testing** | Pytest | Latest | Unit testing |
| **HTTP Client** | HTTPX | Latest | Test client |
---
### 6.2 Frontend Technologies
| Component | Technology | Purpose |
|-----------|-----------|---------|
| **HTML** | HTML5 | Structure |
| **CSS** | CSS3 | Styling |
| **JavaScript** | Vanilla JS | Interactivity |
| **Icons** | Font Awesome | UI icons |
| **Excel** | SheetJS (XLSX) | Excel parsing |
---
### 6.3 Extension Technologies
| Component | Technology | Purpose |
|-----------|-----------|---------|
| **Manifest** | V3 | Extension config |
| **Storage** | Chrome Storage API | Data persistence |
| **Tabs** | Chrome Tabs API | Page interaction |
| **Scripting** | Chrome Scripting API | Content injection |
| **Excel** | SheetJS (XLSX) | Translation data |
| **Matching** | Fuzzysort | Fuzzy text matching |
| **Permissions** | activeTab, storage, scripting | Extension capabilities |
---
### 6.4 Development Tools
| Tool | Purpose |
|------|---------|
| **python-dotenv** | Environment management |
| **pytest-postgresql** | Test database |
| **pytz** | Timezone handling |
| **itsdangerous** | Secure signing |
| **python-multipart** | File upload handling |
---
## 7. DEPLOYMENT ARCHITECTURE
### 7.1 Development Environment
**Requirements:**
- Python 3.10+
- PostgreSQL 12+
- Virtual environment
- Node.js (for frontend builds - optional)
**Setup:**
```bash
# Clone repository
git clone <repo-url>
cd ebook_extension-feature-admin-dashboard
# Create virtual environment
python3 -m venv .venv
source .venv/bin/activate
# Install dependencies
pip install -r requirements.txt
# Configure environment
cp .env.example .env
# Edit .env with database credentials
# Initialize database
cd admin-backend
python init_db.py
# Start server
uvicorn main:app --reload --host 0.0.0.0 --port 8000
```
---
### 7.2 Production Deployment
**Server Requirements:**
- Linux server (Ubuntu 20.04+ recommended)
- Python 3.10+
- PostgreSQL 12+
- Nginx (reverse proxy)
- Systemd (process management)
- SSL certificates (Let's Encrypt)
**Deployment Steps:**
**1. Server Setup:**
```bash
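# Install dependencies (the venv module and the Certbot Nginx plugin are separate packages)
sudo apt update
sudo apt install python3.10 python3.10-venv python3-pip postgresql nginx certbot python3-certbot-nginx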
# Create application user
sudo useradd -m -s /bin/bash ebook-app
```
**2. Application Deployment:**
```bash
# Clone repository
cd /var/www
sudo git clone <repo-url> ebook-app
sudo chown -R ebook-app:ebook-app ebook-app
# Setup virtual environment
cd ebook-app
sudo -u ebook-app python3 -m venv .venv
sudo -u ebook-app .venv/bin/pip install -r requirements.txt
# Configure environment
sudo -u ebook-app cp .env.example .env
# Edit .env with production values
```
**3. Database Setup:**
```bash
# Create the role first, then a database owned by that role
sudo -u postgres psql -c "CREATE USER ebook_user WITH PASSWORD 'secure_password';"
sudo -u postgres createdb -O ebook_user ebook_prod
# Initialize
cd admin-backend
sudo -u ebook-app ../.venv/bin/python init_db.py
```
**4. Systemd Service:**
```ini
# /etc/systemd/system/ebook-api.service
[Unit]
Description=Ebook Translation API
After=network.target postgresql.service
[Service]
Type=notify
User=ebook-app
Group=ebook-app
WorkingDirectory=/var/www/ebook-app/admin-backend
Environment="PATH=/var/www/ebook-app/.venv/bin"
EnvironmentFile=/var/www/ebook-app/.env
ExecStart=/var/www/ebook-app/.venv/bin/gunicorn \
-w 4 \
-k uvicorn.workers.UvicornWorker \
--bind 127.0.0.1:8000 \
main:app
Restart=always
[Install]
WantedBy=multi-user.target
```
**5. Nginx Configuration:**
```nginx
# /etc/nginx/sites-available/ebook-api
upstream ebook_backend {
server 127.0.0.1:8000;
}
server {
listen 80;
server_name yourdomain.com;
return 301 https://$server_name$request_uri;
}
server {
listen 443 ssl http2;
server_name yourdomain.com;
ssl_certificate /etc/letsencrypt/live/yourdomain.com/fullchain.pem;
ssl_certificate_key /etc/letsencrypt/live/yourdomain.com/privkey.pem;
ssl_protocols TLSv1.2 TLSv1.3;
client_max_body_size 10M;
location / {
proxy_pass http://ebook_backend;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
}
}
```
**6. Start Services:**
```bash
# Reload unit files, then enable and start the API
sudo systemctl daemon-reload
sudo systemctl enable ebook-api
sudo systemctl start ebook-api
# Get the SSL certificate first; the site config references the certificate files
sudo certbot certonly --nginx -d yourdomain.com
# Enable the site and restart Nginx
sudo ln -s /etc/nginx/sites-available/ebook-api /etc/nginx/sites-enabled/
sudo nginx -t
sudo systemctl restart nginx
```
---
### 7.3 Chrome Extension Deployment
**Development:**
1. Navigate to `chrome://extensions/`
2. Enable "Developer mode"
3. Click "Load unpacked"
4. Select `/extension` directory
**Production:**
1. Set the production API URL (`API_BASE` in `config.js`) and update `manifest.json` if it references the API host
2. Create ZIP archive of extension directory
3. Upload to Chrome Web Store Developer Dashboard
4. Submit for review
**Configuration:**
```javascript
// extension/config.js
export const CONFIG = {
API_BASE: "https://yourdomain.com", // Production URL
// ... rest of config
};
```
---
### 7.4 Monitoring & Logging
**Application Logs:**
```bash
# View application logs
sudo journalctl -u ebook-api -f
# View error logs
tail -f /var/www/ebook-app/admin-backend/logs/error.log
# View access logs
tail -f /var/www/ebook-app/admin-backend/logs/app.log
```
**Health Monitoring:**
```bash
# Check API health
curl https://yourdomain.com/health
# Check service status
sudo systemctl status ebook-api
# Check database connection
sudo -u postgres psql -d ebook_prod -c "SELECT COUNT(*) FROM coupon_codes;"
```