Configuration & Deployment
Complete guide to configuring and deploying ExtendedLM
Environment Variables
ExtendedLM uses environment variables for configuration. Create a .env.local file in the project root:
Required Variables
# .env.local
# ================
# Database
# ================
DATABASE_URL="postgresql://user:password@localhost:5432/extendedlm"
DIRECT_URL="postgresql://user:password@localhost:5432/extendedlm"
# ================
# Supabase
# ================
NEXT_PUBLIC_SUPABASE_URL="https://your-project.supabase.co"
NEXT_PUBLIC_SUPABASE_ANON_KEY="your-anon-key"
SUPABASE_SERVICE_ROLE_KEY="your-service-role-key"
# ================
# OpenAI API
# ================
OPENAI_API_KEY="sk-..."
OPENAI_BASE_URL="http://localhost:8080/v1" # Optional: Gateway URL
# ================
# Next.js
# ================
NEXT_PUBLIC_APP_URL="http://localhost:3000"
NEXTAUTH_URL="http://localhost:3000"
NEXTAUTH_SECRET="generate-a-random-secret"
Optional Variables
# ================
# AI Providers (Optional)
# ================
ANTHROPIC_API_KEY="sk-ant-..."
GOOGLE_AI_API_KEY="..."
XAI_API_KEY="xai-..."
# ================
# Gateway Configuration
# ================
GATEWAY_URL="http://localhost:8080"
GATEWAY_API_KEY="your-gateway-api-key"
GATEWAY_LLAMA_CPP_HOST="http://localhost:8081"
# ================
# Vector Databases
# ================
# PostgreSQL with pgvector (already configured via DATABASE_URL)
VALKEY_URL="redis://localhost:6379"
VALKEY_VECTOR_INDEX="extendedlm-vectors"
# ================
# Mate (Computer Use)
# ================
MATE_SERVER_URL="http://localhost:8000"
MATE_SERVER_API_KEY="your-mate-key"
MATE_SERVER_DOCKER_NETWORK="extendedlm-network"
# ================
# MCP Servers
# ================
MCP_CAL2PROMPT_ENABLED="true"
MCP_SWITCHBOT_ENABLED="true"
MCP_SWITCHBOT_TOKEN="your-switchbot-token"
MCP_SWITCHBOT_SECRET="your-switchbot-secret"
# ================
# Third-Party APIs
# ================
OPENWEATHER_API_KEY="your-openweather-key"
GOOGLE_MAPS_API_KEY="your-google-maps-key"
# ================
# Storage
# ================
NEXT_PUBLIC_SUPABASE_STORAGE_BUCKET="uploads"
MAX_UPLOAD_SIZE="10485760" # 10MB in bytes
# ================
# Email (Optional)
# ================
SMTP_HOST="smtp.gmail.com"
SMTP_PORT="587"
SMTP_USER="your-email@gmail.com"
SMTP_PASSWORD="your-app-password"
EMAIL_FROM="noreply@extendedlm.com"
# ================
# Analytics (Optional)
# ================
NEXT_PUBLIC_GOOGLE_ANALYTICS_ID="G-XXXXXXXXXX"
NEXT_PUBLIC_POSTHOG_KEY="phc_..."
NEXT_PUBLIC_POSTHOG_HOST="https://app.posthog.com"
# ================
# Feature Flags
# ================
NEXT_PUBLIC_ENABLE_WORKFLOWS="true"
NEXT_PUBLIC_ENABLE_COMPUTER_USE="true"
NEXT_PUBLIC_ENABLE_MCP="true"
NEXT_PUBLIC_ENABLE_GRAPH_RAG="true"
NEXT_PUBLIC_ENABLE_RAPTOR="true"
# ================
# Rate Limiting
# ================
RATE_LIMIT_WINDOW="60000" # 1 minute in ms
RATE_LIMIT_MAX_REQUESTS="60" # 60 requests per minute
# ================
# Security
# ================
ALLOWED_ORIGINS="http://localhost:3000,https://yourdomain.com"
CSRF_SECRET="generate-a-random-secret"
Generating Secrets
# Generate random secrets
openssl rand -base64 32
# Or using Node.js
node -e "console.log(require('crypto').randomBytes(32).toString('base64'))"
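Many deployments fail at boot because a single variable is missing, so it helps to validate the environment once at startup. A minimal sketch (the variable list and the lib/env.ts path are illustrative, not part of ExtendedLM):
// File: lib/env.ts (hypothetical path)
const REQUIRED_VARS = [
  'DATABASE_URL',
  'NEXT_PUBLIC_SUPABASE_URL',
  'NEXT_PUBLIC_SUPABASE_ANON_KEY',
  'SUPABASE_SERVICE_ROLE_KEY',
  'OPENAI_API_KEY',
  'NEXTAUTH_SECRET',
] as const;

export function assertEnv(): void {
  // Collect every missing variable so one restart fixes them all
  const missing = REQUIRED_VARS.filter((name) => !process.env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing required environment variables: ${missing.join(', ')}`);
  }
}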
Model Configuration
Configure available models and their parameters.
Model Definitions
export const models = [
  // OpenAI Models
  {
    id: 'gpt-4o',
    name: 'GPT-4o',
    provider: 'openai',
    contextWindow: 128000,
    maxOutput: 16384,
    pricing: {
      input: 2.50,   // per 1M tokens
      output: 10.00, // per 1M tokens
    },
    capabilities: ['chat', 'vision', 'tools', 'streaming'],
    default: true,
  },
  {
    id: 'gpt-5-preview',
    name: 'GPT-5 Preview',
    provider: 'openai',
    contextWindow: 1000000,
    maxOutput: 65536,
    pricing: { input: 5.00, output: 15.00 },
    capabilities: ['chat', 'vision', 'tools', 'streaming'],
  },
  // Anthropic Models
  {
    id: 'claude-opus-4-20250514',
    name: 'Claude Opus 4',
    provider: 'anthropic',
    contextWindow: 200000,
    maxOutput: 16384,
    pricing: { input: 15.00, output: 75.00 },
    capabilities: ['chat', 'vision', 'tools', 'streaming', 'computer-use'],
  },
  {
    id: 'claude-sonnet-4-20250514',
    name: 'Claude Sonnet 4',
    provider: 'anthropic',
    contextWindow: 200000,
    maxOutput: 16384,
    pricing: { input: 3.00, output: 15.00 },
    capabilities: ['chat', 'vision', 'tools', 'streaming'],
  },
  // Google Models
  {
    id: 'gemini-2.5-pro',
    name: 'Gemini 2.5 Pro',
    provider: 'google',
    contextWindow: 2000000,
    maxOutput: 65536,
    pricing: { input: 1.25, output: 5.00 },
    capabilities: ['chat', 'vision', 'tools', 'streaming'],
  },
  // xAI Models
  {
    id: 'grok-4',
    name: 'Grok 4',
    provider: 'xai',
    contextWindow: 131072,
    maxOutput: 16384,
    pricing: { input: 5.00, output: 15.00 },
    capabilities: ['chat', 'tools', 'streaming'],
  },
  // Local Models (via Gateway)
  {
    id: 'llama-3.3-70b',
    name: 'Llama 3.3 70B',
    provider: 'gateway',
    contextWindow: 131072,
    maxOutput: 8192,
    pricing: { input: 0, output: 0 }, // Free (local)
    capabilities: ['chat', 'tools', 'streaming'],
  },
];

export const defaultModelParams = {
  temperature: 0.7,
  maxTokens: 4096,
  topP: 1.0,
  frequencyPenalty: 0,
  presencePenalty: 0,
};
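The pricing fields above are per 1M tokens, so a request's cost is tokens divided by one million, times the rate. A small helper sketched against the models array (estimateCost and the import path are illustrative, not an ExtendedLM API):
import { models } from '@/config/models'; // path is an assumption

export function estimateCost(modelId: string, inputTokens: number, outputTokens: number): number {
  const model = models.find((m) => m.id === modelId);
  if (!model) throw new Error(`Unknown model: ${modelId}`);
  // pricing is expressed per 1M tokens
  return (
    (inputTokens / 1_000_000) * model.pricing.input +
    (outputTokens / 1_000_000) * model.pricing.output
  );
}

// e.g. 10k input + 2k output tokens on gpt-4o:
// 0.01 * 2.50 + 0.002 * 10.00 = $0.045
estimateCost('gpt-4o', 10_000, 2_000);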
Provider Configuration
export const providers = {
  openai: {
    baseURL: process.env.OPENAI_BASE_URL || 'https://api.openai.com/v1',
    apiKey: process.env.OPENAI_API_KEY,
    organization: process.env.OPENAI_ORGANIZATION,
  },
  anthropic: {
    baseURL: 'https://api.anthropic.com',
    apiKey: process.env.ANTHROPIC_API_KEY,
    version: '2023-06-01',
  },
  google: {
    baseURL: 'https://generativelanguage.googleapis.com/v1beta',
    apiKey: process.env.GOOGLE_AI_API_KEY,
  },
  xai: {
    baseURL: 'https://api.x.ai/v1',
    apiKey: process.env.XAI_API_KEY,
  },
  gateway: {
    baseURL: process.env.GATEWAY_URL || 'http://localhost:8080',
    apiKey: process.env.GATEWAY_API_KEY,
  },
};
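Each model's provider field keys into this map, so resolving credentials for a request is a single lookup. A sketch of how the two configs might be joined (resolveProvider and the import paths are assumptions):
import { models } from '@/config/models';       // paths are assumptions
import { providers } from '@/config/providers';

export function resolveProvider(modelId: string) {
  const model = models.find((m) => m.id === modelId);
  if (!model) throw new Error(`Unknown model: ${modelId}`);
  const provider = providers[model.provider as keyof typeof providers];
  // Local gateway models may run without an API key; hosted providers cannot
  if (!provider?.apiKey && model.provider !== 'gateway') {
    throw new Error(`No API key configured for provider: ${model.provider}`);
  }
  return { model, provider };
}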
MCP Configuration
Configure Model Context Protocol servers.
MCP Configuration File
{ "mcpServers": { "cal2prompt": { "command": "npx", "args": ["-y", "cal2prompt"], "env": {}, "enabled": true, "description": "Calendar integration for reading and managing events" }, "switchbot": { "command": "node", "args": ["mcp-servers/switchbot/build/index.js"], "env": { "SWITCHBOT_TOKEN": "${SWITCHBOT_TOKEN}", "SWITCHBOT_SECRET": "${SWITCHBOT_SECRET}" }, "enabled": true, "description": "Control SwitchBot smart home devices" }, "jgrants": { "command": "npx", "args": ["-y", "@kimtaeyoon83/mcp-server-jgrants"], "env": {}, "enabled": true, "description": "Search for research grants and funding opportunities" }, "custom-api": { "command": "python", "args": ["-m", "mcp_servers.custom_api"], "env": { "CUSTOM_API_KEY": "${CUSTOM_API_KEY}" }, "enabled": false, "description": "Custom API integration" } }, "defaults": { "timeout": 30000, "retries": 3, "logLevel": "info" } }Loading MCP Configuration
Loading MCP Configuration
import mcpConfig from '@/mcp-config.json';

export function loadMCPConfig() {
  const config = { ...mcpConfig };

  // Replace ${VAR} placeholders with values from the environment
  for (const [name, server] of Object.entries(config.mcpServers)) {
    if (server.env) {
      for (const [key, value] of Object.entries(server.env)) {
        if (typeof value === 'string' && value.startsWith('${')) {
          const envVar = value.slice(2, -1);
          server.env[key] = process.env[envVar] || '';
        }
      }
    }

    // Check if enabled via environment (e.g. MCP_SWITCHBOT_ENABLED)
    const enabledEnvVar = `MCP_${name.toUpperCase()}_ENABLED`;
    if (process.env[enabledEnvVar] !== undefined) {
      server.enabled = process.env[enabledEnvVar] === 'true';
    }
  }

  return config;
}
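Callers would typically filter for enabled servers before spawning them, for example:
const config = loadMCPConfig();
const enabledServers = Object.entries(config.mcpServers)
  .filter(([, server]) => server.enabled);

for (const [name, server] of enabledServers) {
  // Spawning is left to the MCP client; this just shows what would run
  console.log(`Starting MCP server "${name}": ${server.command} ${server.args.join(' ')}`);
}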
RAG Configuration
Fine-tune RAG system parameters for optimal retrieval.
RAG Settings
export const ragConfig = {
  // Embedding Configuration
  embedding: {
    model: 'text-embedding-3-large',
    dimensions: 3072,
    batchSize: 100,
  },

  // Chunking Configuration
  chunking: {
    strategy: 'recursive', // 'recursive' | 'semantic' | 'raptor'
    chunkSize: 1000,
    chunkOverlap: 200,
    separators: ['\n\n', '\n', '. ', ' ', ''],
  },

  // RAPTOR Configuration
  raptor: {
    enabled: true,
    levels: 3,
    clusteringAlgorithm: 'kmeans',
    summaryModel: 'gpt-4o-mini',
  },

  // Retrieval Configuration
  retrieval: {
    k: 5, // Number of chunks to retrieve
    similarityThreshold: 0.7, // Minimum similarity score
    hybridSearch: true, // Combine vector + keyword search
    alpha: 0.7, // Weight for vector search (0-1)
  },

  // Reranking Configuration
  reranking: {
    enabled: true,
    model: 'cross-encoder/ms-marco-MiniLM-L-6-v2',
    topK: 3, // Re-rank top K results
  },

  // Knowledge Graph Configuration
  knowledgeGraph: {
    enabled: true,
    extractionModel: 'gpt-4o',
    entityTypes: ['PERSON', 'ORG', 'LOCATION', 'DATE', 'CONCEPT'],
    relationshipTypes: ['WORKS_AT', 'LOCATED_IN', 'RELATED_TO'],
    communityDetection: true,
  },

  // Caching Configuration
  cache: {
    enabled: true,
    ttl: 3600, // 1 hour in seconds
    maxSize: 1000, // Max cached queries
  },

  // Vector Database Configuration
  vectorDB: {
    provider: 'pgvector', // 'pgvector' | 'valkey'
    pgvector: {
      tableName: 'document_chunks',
      indexType: 'hnsw',
      m: 16,
      efConstruction: 64,
    },
    valkey: {
      host: 'localhost',
      port: 6379,
      indexName: 'extendedlm-vectors',
      algorithm: 'HNSW',
      m: 16,
      efConstruction: 200,
    },
  },
};
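The alpha parameter controls how vector and keyword scores are blended in hybrid search: final = alpha * vector + (1 - alpha) * keyword. A sketch of that weighting, assuming both scores are already normalized to [0, 1] (this helper is illustrative, not ExtendedLM's actual retriever):
interface ScoredChunk {
  id: string;
  vectorScore: number;  // cosine similarity, normalized to [0, 1]
  keywordScore: number; // BM25 or similar, normalized to [0, 1]
}

export function hybridScore(chunk: ScoredChunk, alpha = 0.7): number {
  // alpha = 1 -> pure vector search; alpha = 0 -> pure keyword search
  return alpha * chunk.vectorScore + (1 - alpha) * chunk.keywordScore;
}

export function rankHybrid(chunks: ScoredChunk[], alpha = 0.7, k = 5): ScoredChunk[] {
  return [...chunks]
    .sort((a, b) => hybridScore(b, alpha) - hybridScore(a, alpha))
    .slice(0, k);
}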
Gateway Configuration
Configure the Rust-based Gateway for local LLM inference.
Gateway Config File
# File: gateway/config.toml
[server]
host = "0.0.0.0"
port = 8080
workers = 4
[llama_cpp]
host = "http://localhost:8081"
n_gpu_layers = 35
n_ctx = 131072
n_batch = 512
n_threads = 8
use_mmap = true
use_mlock = false
[models]
default_model = "llama-3.3-70b"
[[models.available]]
id = "llama-3.3-70b"
name = "Llama 3.3 70B"
path = "/models/llama-3.3-70b-instruct.gguf"
context_length = 131072
rope_freq_base = 500000.0
rope_freq_scale = 1.0
[[models.available]]
id = "qwen2.5-72b"
name = "Qwen 2.5 72B"
path = "/models/qwen2.5-72b-instruct.gguf"
context_length = 131072
[logging]
level = "info"
file = "/var/log/gateway.log"
[rate_limiting]
enabled = true
requests_per_minute = 60
burst_size = 10
[cache]
enabled = true
ttl = 3600
max_size_mb = 1024
Building Gateway
# Build Gateway
cd gateway
cargo build --release
# Copy binary
cp target/release/gateway /usr/local/bin/
# Create systemd service
cat <<EOF | sudo tee /etc/systemd/system/gateway.service
[Unit]
Description=ExtendedLM Gateway
After=network.target
[Service]
Type=simple
User=gateway
WorkingDirectory=/opt/extendedlm/gateway
ExecStart=/usr/local/bin/gateway --config /opt/extendedlm/gateway/config.toml
Restart=on-failure
[Install]
WantedBy=multi-user.target
EOF
# Enable and start
sudo systemctl enable gateway
sudo systemctl start gateway
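Once the service is running, a quick smoke test confirms the Gateway answers. This assumes it exposes an OpenAI-compatible /v1/chat/completions route, as the OPENAI_BASE_URL example earlier suggests:
// Quick smoke test (run with: npx tsx smoke-test.ts);
// the /v1/chat/completions route shape is an assumption
async function main() {
  const res = await fetch('http://localhost:8080/v1/chat/completions', {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      Authorization: `Bearer ${process.env.GATEWAY_API_KEY ?? ''}`,
    },
    body: JSON.stringify({
      model: 'llama-3.3-70b',
      messages: [{ role: 'user', content: 'ping' }],
      max_tokens: 8,
    }),
  });
  console.log(res.status, await res.json());
}

main();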
GPU Configuration
# NVIDIA GPU (CUDA)
export CUDA_VISIBLE_DEVICES=0
export LLAMA_CUDA=1
# AMD GPU (ROCm)
export HIP_VISIBLE_DEVICES=0
export LLAMA_HIPBLAS=1
# Apple Silicon (Metal)
export LLAMA_METAL=1
# Configure GPU layers in config.toml
# More layers = faster but more VRAM usage
# For 70B model on 24GB VRAM: ~35 layers
# For full offload on 80GB VRAM: -1 (all layers)
Docker Deployment
Deploy ExtendedLM using Docker Compose.
Docker Compose Configuration
# File: docker-compose.yml
version: '3.8'

services:
  # Next.js Application
  app:
    build:
      context: .
      dockerfile: Dockerfile
    ports:
      - "3000:3000"
    environment:
      - DATABASE_URL=postgresql://postgres:password@postgres:5432/extendedlm
      - VALKEY_URL=redis://valkey:6379
      - GATEWAY_URL=http://gateway:8080
      - MATE_SERVER_URL=http://mate:8000
    depends_on:
      - postgres
      - valkey
      - gateway
      - mate
    networks:
      - extendedlm
    restart: unless-stopped

  # PostgreSQL with pgvector
  postgres:
    image: pgvector/pgvector:pg16
    environment:
      POSTGRES_USER: postgres
      POSTGRES_PASSWORD: password
      POSTGRES_DB: extendedlm
    volumes:
      - postgres_data:/var/lib/postgresql/data
      - ./scripts/init.sql:/docker-entrypoint-initdb.d/init.sql
    ports:
      - "5432:5432"
    networks:
      - extendedlm
    restart: unless-stopped

  # Valkey (Redis-compatible) for vector search
  valkey:
    image: valkey/valkey:latest
    command: valkey-server --loadmodule /usr/lib/redis/modules/redisearch.so
    ports:
      - "6379:6379"
    volumes:
      - valkey_data:/data
    networks:
      - extendedlm
    restart: unless-stopped

  # Gateway (Rust + llama.cpp)
  gateway:
    build:
      context: ./gateway
      dockerfile: Dockerfile
    ports:
      - "8080:8080"
    volumes:
      - ./models:/models
      - ./gateway/config.toml:/app/config.toml
    environment:
      - CUDA_VISIBLE_DEVICES=0
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
    networks:
      - extendedlm
    restart: unless-stopped

  # Mate (Python FastAPI)
  mate:
    build:
      context: ./mate
      dockerfile: Dockerfile
    ports:
      - "8000:8000"
      - "5900:5900" # VNC
    environment:
      - DISPLAY=:99
      - MATE_SERVER_API_KEY=${MATE_SERVER_API_KEY}
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
    networks:
      - extendedlm
    restart: unless-stopped

networks:
  extendedlm:
    driver: bridge

volumes:
  postgres_data:
  valkey_data:
Application Dockerfile
# File: Dockerfile
FROM node:20-alpine AS base
# Install dependencies only when needed
FROM base AS deps
RUN apk add --no-cache libc6-compat
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci
# Rebuild the source code only when needed
FROM base AS builder
WORKDIR /app
COPY --from=deps /app/node_modules ./node_modules
COPY . .
# Set environment variables for build
ENV NEXT_TELEMETRY_DISABLED=1
ENV NODE_ENV=production
RUN npm run build
# Production image
FROM base AS runner
WORKDIR /app
ENV NODE_ENV=production
ENV NEXT_TELEMETRY_DISABLED=1
RUN addgroup --system --gid 1001 nodejs
RUN adduser --system --uid 1001 nextjs
COPY --from=builder /app/public ./public
COPY --from=builder --chown=nextjs:nodejs /app/.next/standalone ./
COPY --from=builder --chown=nextjs:nodejs /app/.next/static ./.next/static
USER nextjs
EXPOSE 3000
ENV PORT=3000
ENV HOSTNAME="0.0.0.0"
CMD ["node", "server.js"]
Starting Services
# Start all services
docker-compose up -d
# View logs
docker-compose logs -f app
# Stop services
docker-compose down
# Rebuild and restart
docker-compose up -d --build
# Scale app service
docker-compose up -d --scale app=3
Production Deployment
Best practices for deploying ExtendedLM in production.
Vercel Deployment
# Install Vercel CLI
npm i -g vercel
# Login
vercel login
# Deploy
vercel
# Deploy to production
vercel --prod
# Set environment variables
vercel env add DATABASE_URL
vercel env add OPENAI_API_KEY
# ... add all required env vars
Vercel Configuration
// File: vercel.json
{
"framework": "nextjs",
"buildCommand": "npm run build",
"devCommand": "npm run dev",
"installCommand": "npm install",
"regions": ["sfo1"],
"env": {
"NEXT_PUBLIC_APP_URL": "https://yourdomain.com"
},
"functions": {
"app/api/**/*.ts": {
"maxDuration": 60
}
},
"headers": [
{
"source": "/api/:path*",
"headers": [
{ "key": "Access-Control-Allow-Origin", "value": "*" },
{ "key": "Access-Control-Allow-Methods", "value": "GET,POST,PUT,DELETE,OPTIONS" },
{ "key": "Access-Control-Allow-Headers", "value": "Content-Type, Authorization" }
]
}
]
}
Self-Hosted (Ubuntu Server)
# Update system
sudo apt update && sudo apt upgrade -y
# Install Node.js 20
curl -fsSL https://deb.nodesource.com/setup_20.x | sudo -E bash -
sudo apt install -y nodejs
# Install PostgreSQL 16 with pgvector
sudo apt install -y postgresql-16 postgresql-16-pgvector
# Install Docker
curl -fsSL https://get.docker.com | sh
sudo usermod -aG docker $USER
# Clone repository
git clone https://github.com/yourusername/extendedlm.git
cd extendedlm
# Install dependencies
npm install
# Build application
npm run build
# Install PM2 for process management
npm install -g pm2
# Start with PM2
pm2 start npm --name "extendedlm" -- start
pm2 save
pm2 startup
# Configure Nginx reverse proxy
sudo apt install -y nginx
sudo nano /etc/nginx/sites-available/extendedlm
# Nginx config (see below)
sudo ln -s /etc/nginx/sites-available/extendedlm /etc/nginx/sites-enabled/
sudo nginx -t
sudo systemctl reload nginx
Nginx Configuration
# File: /etc/nginx/sites-available/extendedlm
upstream nextjs {
server localhost:3000;
}
upstream gateway {
server localhost:8080;
}
server {
listen 80;
server_name yourdomain.com;
# Redirect to HTTPS
return 301 https://$server_name$request_uri;
}
server {
listen 443 ssl http2;
server_name yourdomain.com;
# SSL certificates (use certbot)
ssl_certificate /etc/letsencrypt/live/yourdomain.com/fullchain.pem;
ssl_certificate_key /etc/letsencrypt/live/yourdomain.com/privkey.pem;
# Security headers
add_header Strict-Transport-Security "max-age=31536000; includeSubDomains" always;
add_header X-Content-Type-Options "nosniff" always;
add_header X-Frame-Options "DENY" always;
add_header X-XSS-Protection "1; mode=block" always;
# Next.js app
location / {
proxy_pass http://nextjs;
proxy_http_version 1.1;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection 'upgrade';
proxy_set_header Host $host;
proxy_cache_bypass $http_upgrade;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
}
# Gateway API
location /gateway/ {
proxy_pass http://gateway/;
proxy_http_version 1.1;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
}
# SSE (Server-Sent Events)
location /api/chat/stream {
proxy_pass http://nextjs;
proxy_http_version 1.1;
proxy_set_header Connection '';
proxy_buffering off;
proxy_cache off;
chunked_transfer_encoding off;
}
}
SSL Certificate (Let's Encrypt)
# Install certbot
sudo apt install -y certbot python3-certbot-nginx
# Obtain certificate
sudo certbot --nginx -d yourdomain.com
# Auto-renewal
sudo certbot renew --dry-run
Database Setup
Initialize PostgreSQL database with required extensions and schema.
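As a starting point, here is a sketch that enables the pgvector extension and creates a chunks table matching ragConfig.vectorDB.pgvector above (the script path, table, and column names are assumptions, not ExtendedLM's actual schema):
// File: scripts/init-db.ts (hypothetical) — run with: npx tsx scripts/init-db.ts
import { Client } from 'pg';

async function main() {
  const client = new Client({ connectionString: process.env.DATABASE_URL });
  await client.connect();

  // pgvector extension (required for vector columns)
  await client.query('CREATE EXTENSION IF NOT EXISTS vector');

  // Chunk table matching ragConfig.vectorDB.pgvector (names are assumptions)
  await client.query(`
    CREATE TABLE IF NOT EXISTS document_chunks (
      id BIGSERIAL PRIMARY KEY,
      document_id BIGINT NOT NULL,
      content TEXT NOT NULL,
      embedding vector(3072)
    )
  `);

  await client.end();
}

main().catch((err) => {
  console.error(err);
  process.exit(1);
});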
Monitoring & Logging
Set up monitoring and logging for production systems.
Application Logging
import winston from 'winston';
const logger = winston.createLogger({
level: process.env.LOG_LEVEL || 'info',
format: winston.format.combine(
winston.format.timestamp(),
winston.format.errors({ stack: true }),
winston.format.json()
),
transports: [
// Console output
new winston.transports.Console({
format: winston.format.combine(
winston.format.colorize(),
winston.format.simple()
),
}),
// File output
new winston.transports.File({
filename: 'logs/error.log',
level: 'error',
}),
new winston.transports.File({
filename: 'logs/combined.log',
}),
],
});
export default logger;
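Usage follows the standard winston pattern, e.g. (import path is an assumption):
import logger from '@/lib/logger';

logger.info('Server started', { port: 3000 });
logger.error('Gateway request failed', { model: 'llama-3.3-70b' });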
Performance Monitoring
import { performance } from 'perf_hooks';
export function measurePerformance(name: string) {
const start = performance.now();
return {
end: () => {
const duration = performance.now() - start;
logger.info(`[PERF] ${name}: ${duration.toFixed(2)}ms`);
return duration;
},
};
}
// Usage
const perf = measurePerformance('API Call');
await someAPICall();
perf.end();
Health Check Endpoint
export async function GET() {
  const checks = {
    database: await checkDatabase(),
    gateway: await checkGateway(),
    valkey: await checkValkey(),
    mate: await checkMate(),
  };

  const allHealthy = Object.values(checks).every((status) => status === 'ok');

  return Response.json(
    {
      status: allHealthy ? 'healthy' : 'degraded',
      checks,
      timestamp: new Date().toISOString(),
    },
    { status: allHealthy ? 200 : 503 }
  );
}

async function checkDatabase() {
  try {
    await db.query('SELECT 1');
    return 'ok';
  } catch {
    return 'error';
  }
}
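checkDatabase is shown above; the other checks would follow the same shape. A sketch of checkGateway, assuming the Gateway exposes a /health route (not confirmed by the config above):
async function checkGateway() {
  try {
    // /health endpoint is an assumption about the Gateway's API
    const res = await fetch(`${process.env.GATEWAY_URL}/health`, {
      signal: AbortSignal.timeout(5_000),
    });
    return res.ok ? 'ok' : 'error';
  } catch {
    return 'error';
  }
}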
Backup & Recovery
Implement backup strategies for data protection.
Database Backup
#!/bin/bash
# File: scripts/backup.sh
DATE=$(date +%Y%m%d_%H%M%S)
BACKUP_DIR="/backups"
DB_NAME="extendedlm"
# PostgreSQL backup
pg_dump -U postgres $DB_NAME | gzip > "$BACKUP_DIR/db_$DATE.sql.gz"
# Valkey backup
valkey-cli --rdb "$BACKUP_DIR/valkey_$DATE.rdb"
# Cleanup old backups (keep 30 days)
find $BACKUP_DIR -name "db_*.sql.gz" -mtime +30 -delete
find $BACKUP_DIR -name "valkey_*.rdb" -mtime +30 -delete
echo "Backup completed: $DATE"
Automated Backups (Cron)
# Add to crontab
crontab -e
# Daily backup at 2 AM
0 2 * * * /opt/extendedlm/scripts/backup.sh >> /var/log/backup.log 2>&1
Restore from Backup
# Restore PostgreSQL
gunzip -c /backups/db_20250101_020000.sql.gz | psql -U postgres extendedlm
# Restore Valkey
# Note: valkey-cli --rdb dumps an RDB *from* a running server; to restore,
# stop the server, replace the RDB file, and restart (data dir may vary)
sudo systemctl stop valkey
sudo cp /backups/valkey_20250101_020000.rdb /var/lib/valkey/dump.rdb
sudo systemctl start valkey
Security Best Practices
Implement security measures to protect your deployment.
Environment Security
- Never commit .env files: Add .env.local to .gitignore
- Use strong secrets: Generate with openssl rand -base64 32
- Rotate API keys: Regularly rotate sensitive keys
- Limit permissions: Use the principle of least privilege
- Enable 2FA: Require two-factor authentication
Database Security
- Row-Level Security: Enable RLS on all tables
- Prepared Statements: Prevent SQL injection
- Encrypted Connections: Use SSL/TLS for database connections
- Regular Backups: Automated daily backups
API Security
- Rate Limiting: Prevent abuse with rate limits (see the sketch after this list)
- API Keys: Require authentication for all endpoints
- CORS: Configure allowed origins
- Input Validation: Validate and sanitize all inputs
- HTTPS Only: Redirect HTTP to HTTPS
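The RATE_LIMIT_WINDOW and RATE_LIMIT_MAX_REQUESTS variables defined earlier can back a simple fixed-window limiter. A minimal in-memory sketch (per-instance only; a shared store such as Valkey would be needed across replicas):
const WINDOW_MS = Number(process.env.RATE_LIMIT_WINDOW ?? 60_000);
const MAX_REQUESTS = Number(process.env.RATE_LIMIT_MAX_REQUESTS ?? 60);

const hits = new Map<string, { count: number; windowStart: number }>();

export function isRateLimited(clientId: string): boolean {
  const now = Date.now();
  const entry = hits.get(clientId);

  // Start a fresh window if none exists or the current one has expired
  if (!entry || now - entry.windowStart >= WINDOW_MS) {
    hits.set(clientId, { count: 1, windowStart: now });
    return false;
  }

  entry.count += 1;
  return entry.count > MAX_REQUESTS;
}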
Docker Security
- Non-root User: Run containers as non-root
- Read-only Filesystem: Use read-only where possible
- Resource Limits: Set CPU and memory limits
- Network Isolation: Use Docker networks
- Scan Images: Regularly scan for vulnerabilities