► SHARED HOSTING ARCHITECTURE
┌─────────────────────────────────────────────────────────────────────┐
│ USER REQUEST (HTTPS) │
└─────────────────────────┬───────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────────┐
│ HOSTINGER VPS (Frontend Layer) │
│ 93.127.217.187 │
├─────────────────────────────────────────────────────────────────────┤
│ NGINX → SSL Termination & Reverse Proxy │
│ │
│ Frontend Services (Static Files + Light Python): │
│ • Port 5027: WEBGPU27 Frontend │
│ • Port 5030: WEBGPU30 Frontend (gunicorn) │
│ • Port 6000-6006: Frontend app servers │
│ • /var/www/html/webgpuXX/: Static HTML/CSS/JS │
│ │
│ Role: Serve web pages, handle user sessions, route requests │
└────────────────────────┬────────────────────────────────────────────┘
│ API Calls for GPU/LLM Processing
│ (Frontend JS → Backend API)
▼
┌─────────────────────────────────────────────────────────────────────┐
│ DATABASEMART GPU (Backend Layer) │
│ 77.93.154.44 │
│ NVIDIA P1000 GPU │
├─────────────────────────────────────────────────────────────────────┤
│ Backend Services (GPU-Intensive Processing): │
│ │
│ SHARED SERVICES (Used by ALL WEBGPU apps): │
│ • Port 5000: Main LLM Service (WEBGPU0000) │
│ • Port 5003: Admin WEBGPU Service │
│ • Port 5004: Mobile WEBGPU (4 workers) │
│ • Port 6030: Shared LLM Service (uvicorn) │
│ • /opt/webgpu_shared_backend/: Shared backend │
│ • /opt/webgpu_llm_service/: Main LLM processor │
│ │
│ SPECIALIZED SERVICES: │
│ • Port 6000: Landing Page Service │
│ • Port 6033: Math Specialist │
│ • Port 6034: Instrumentation Specialist │
│ │
│ Role: GPU inference, model loading, vector search, RAG │
└─────────────────────────────────────────────────────────────────────┘
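The frontend-to-backend hop in the diagram above can be sketched in a few lines of Python, as the light frontend app (or its JavaScript equivalent) would issue it. Only the host 77.93.154.44 and port 5000 come from the diagram; the `/api/chat` path and the JSON payload shape are assumptions for illustration.

```python
import json
import urllib.request

# Shared LLM backend from the diagram (DatabaseMart GPU, port 5000).
BACKEND = "http://77.93.154.44:5000"

def build_api_request(question: str) -> urllib.request.Request:
    """Build the API call the frontend would send to the shared LLM
    service. The /api/chat path and payload keys are hypothetical."""
    payload = json.dumps({"question": question}).encode("utf-8")
    return urllib.request.Request(
        f"{BACKEND}/api/chat",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def ask_llm(question: str) -> str:
    """Send the question to the GPU backend and return its answer
    (assumes the response JSON carries an 'answer' field)."""
    with urllib.request.urlopen(build_api_request(question), timeout=60) as resp:
        return json.loads(resp.read())["answer"]
```

The GPU layer never serves pages; the frontend only reaches it for calls like this one.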
► SHARED vs SPLIT BREAKDOWN
SHARED BACKEND SERVICES (All WEBGPU apps use these):
┌────────────────────────────────────────────────────────┐
│ Port 5000: Main LLM Engine │
│ Port 5003: Admin Control Panel │
│ Port 5004: Mobile API Gateway │
│ Port 6030: Shared LLM Service │
│ /opt/webgpu_shared_backend/: Common backend logic │
└────────────────────────────────────────────────────────┘
SPLIT FRONTEND SERVICES (Per-domain static files):
┌────────────────────────────────────────────────────────┐
│ Each WEBGPU domain has its own: │
│ • Static HTML/CSS/JS files │
│ • Nginx virtual host configuration │
│ • SSL certificate │
│ • Frontend Python app (light) │
└────────────────────────────────────────────────────────┘
SPECIALIZED BACKEND SERVICES (Domain-specific):
┌────────────────────────────────────────────────────────┐
│ Port 6033: Math calculations (astmai.com etc) │
│ Port 6034: Instrumentation data │
│ Port 6000: Landing pages │
└────────────────────────────────────────────────────────┘
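The shared-vs-specialized split above boils down to one routing decision: most domains hit the shared LLM port, while a few map to specialist services. A minimal sketch of that lookup (astmai.com and the port numbers come from the breakdown; the second domain name is a hypothetical stand-in):

```python
# Backend ports from the breakdown above.
SHARED_LLM_PORT = 5000          # Main LLM Engine (all WEBGPU apps)
SPECIALIST_PORTS = {
    "astmai.com": 6033,         # Math Specialist (named in the breakdown)
    "instrumentation.example": 6034,  # hypothetical domain for port 6034
}

def backend_port(domain: str) -> int:
    """Route a domain to its specialist backend if one exists,
    otherwise fall back to the shared LLM service."""
    return SPECIALIST_PORTS.get(domain, SHARED_LLM_PORT)
```

This is why adding a new WEBGPU site is cheap: unless it needs a specialist, it only adds a frontend entry and reuses port 5000.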
► HOW THE SPLIT ARCHITECTURE WORKS
Step 1: User visits https://fbahistoryai.com
│
▼
Step 2: Hostinger Nginx serves static HTML/CSS/JS
│ (Fast - no GPU needed for page load)
▼
Step 3: User interacts with page, asks AI question
│ JavaScript makes API call
▼
Step 4: DatabaseMart GPU backend receives API request
│ Port 5000 (shared LLM service)
▼
Step 5: GPU processes the query
│ • Load model
│ • Vector search
│ • Generate response
▼
Step 6: Response sent back to frontend
│
▼
Step 7: JavaScript displays answer to user
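On the backend side, steps 4 through 6 reduce to a small pipeline: receive the query, run vector search, generate a response. A stub sketch of that pipeline (the function bodies are placeholders for illustration, not the actual service code; real retrieval and generation would run on the P1000 GPU):

```python
def vector_search(query: str, corpus: list[str], top_k: int = 2) -> list[str]:
    """Placeholder retrieval: rank corpus entries by word overlap with
    the query. The real service would use embedding similarity."""
    words = set(query.lower().split())
    ranked = sorted(corpus, key=lambda doc: -len(words & set(doc.lower().split())))
    return ranked[:top_k]

def generate_response(query: str, context: list[str]) -> str:
    """Placeholder for GPU inference: a real LLM would condition on the
    retrieved context; here we just echo both."""
    return f"Q: {query} | context: {' / '.join(context)}"

def handle_query(query: str, corpus: list[str]) -> str:
    """Steps 4-6: receive the request, vector search, generate response."""
    context = vector_search(query, corpus)
    return generate_response(query, context)
```

The frontend only ever sees the string returned by `handle_query`; everything GPU-bound stays behind the API boundary.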
KEY INSIGHT:
• Frontend (Hostinger) = Fast static file serving
• Backend (DatabaseMart) = Compute-heavy GPU processing
• Shared Services = Cost-effective (1 LLM serves 20 sites)